CN110096597A - A text TF-IDF feature reconstruction method combining emotional intensity
- Publication number
- CN110096597A (application number CN201910224082.0A)
- Authority
- CN
- China
- Prior art keywords
- word
- text
- idf
- emotional intensity
- dictionary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/374—Thesaurus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The present invention relates to a text TF-IDF feature reconstruction method combining emotional intensity. Emoticons and user names are extracted and separated by regular-expression matching; word intensity is corrected according to an intensity dictionary and the positional relationships of negation words, degree adverbs and repeated words; and new words are replaced by a Word2Vec-based near-synonym replacement method, so that the TF-IDF feature vector of the text is reconstructed. Compared with the prior art, the invention takes negation words, degree adverbs and repeated words into account when correcting the TF-IDF feature of each word, retaining information such as the intensity and position of words; it replaces new words in the test set with known words that occurred in the training set, which improves generalization; and it accepts the original sentence directly as input, with no manual word segmentation required.
Description
Technical field
The invention belongs to the field of classification in natural language processing and relates to a preprocessing method for text classification, in particular to a text TF-IDF feature reconstruction method combining emotional intensity.
Background art
At present, term frequency-inverse document frequency (TF-IDF) is commonly used in natural language processing and machine learning to construct the feature vector of a text. Internet language, typified by microblog posts, contains special components such as emoticons and user names; existing methods do not handle them, which muddles the information. Elements of Chinese text such as negation words, degree adverbs and repeated words directly affect the emotional intensity and polarity of a text, yet the feature vectors produced by existing methods cannot retain this information, so the information is distorted. New words that appear in the test set or in actual use but not in the training set are simply discarded by existing methods, so information is lost.
Summary of the invention
The object of the present invention is to overcome the above drawbacks of the prior art by providing a preprocessing method for text sentiment analysis. Emoticons and user names are extracted and separated by regular-expression matching; word intensity is corrected according to an intensity dictionary and the positional relationships of negation words, degree adverbs and repeated words; and new words are replaced by a Word2Vec-based near-synonym replacement method, so that the TF-IDF feature vector of the text is reconstructed.
The object of the present invention can be achieved through the following technical solution:
A text TF-IDF feature reconstruction method combining emotional intensity, comprising the following steps:
S1. Construct a stop-word dictionary, a degree dictionary and a negation dictionary; the words in the degree dictionary are degree adverbs with emotional-intensity levels, and the words in the negation dictionary are negation words;
S2. Obtain the text to be analyzed and split it into multiple clauses, using punctuation marks as separators;
S3. Traverse each word in a clause and record its number of occurrences and its positions, delete the stop words, apply an emotional-intensity correction to the word following a degree adverb, and flip the emotional polarity of the word following a negation word;
S4. For each text to be analyzed, create an empty dictionary keyed by word, with the word's emotional intensity and count as the value. Traverse each word: if the current word is a stop word, degree adverb or negation word, skip it without any operation; if the dictionary does not yet contain the current word, store the word in the dictionary; if the dictionary already contains the current word, update that word's emotional intensity and count;
S5. Extract the TF-IDF feature values of the text and multiply the TF-IDF value of each word by its emotional intensity in the dictionary to obtain the reconstructed feature value:
$$\text{TF-IDF}_{new,w} = \text{TF-IDF}_{w} \times deg_{w}$$
where $\text{TF-IDF}_{new,w}$ is the TF-IDF feature value of word w after reconstruction, $\text{TF-IDF}_{w}$ is the original TF-IDF feature value of word w, and $deg_{w}$ is the emotional intensity of word w.
The stop-word dictionary includes English characters, digits and mathematical symbols.
The text to be analyzed is a microblog text containing user names and emoticons. In step S2, regular-expression matching is first used to match and extract the user names (the text after the @ symbol) and the emoticons (the text within [ ] symbols), separating them from the plain text so that sentiment-bearing words inside them do not affect the sentiment of the whole text.
In step S2, the separators between clauses are punctuation marks.
The punctuation marks exclude the enumeration comma, quotation marks, dashes, single quotation marks and colons.
In step S3, the emotional intensity of a word is calculated as:
$$deg_w = (-1)^n \prod_{i=1}^{m} pow_i$$
where $deg_w$ is the emotional intensity of word w, the word is preceded by m degree adverbs and n negation words, and $pow_i$ is the intensity value of the i-th degree adverb.
In step S3, if a negation word precedes a degree adverb, the emotional intensity $deg_w$ of the corresponding word is replaced by its negative reciprocal:
$$deg_w \leftarrow -\frac{1}{deg_w}$$
At initialization, the method first builds a list that stores all words that occurred during training. In step S4, each word is compared against the list; when a word is a new word that is absent from the list, the near-synonym replacement method is used to replace it with the most similar word in the list.
In step S4, after all words in the text have been stored in the dictionary, the emotional intensity of each word is also weighted, specifically by the following steps:
1) Divide the total emotional intensity Dict[w][deg] of the word by its number of occurrences Dict[w][count] to obtain the average emotional intensity of word w in the text:
$$\overline{deg_w} = \frac{Dict[w][deg]}{Dict[w][count]}$$
2) Apply an excitation to the total emotional intensity Dict[w][deg] of word w: if $\overline{deg_w} > 0$, Dict[w][deg] is updated to Dict[w][deg] + $deg_w$ + M; if $\overline{deg_w} < 0$, Dict[w][deg] is updated to Dict[w][deg] + $deg_w$ - M. After simplification, the actual emotional intensity of word w, with N = Dict[w][count] occurrences in the text, is:
$$Deg_w = \begin{cases} \overline{deg_w} + \dfrac{N-1}{N} M, & \overline{deg_w} > 0 \\ \overline{deg_w} - \dfrac{N-1}{N} M, & \overline{deg_w} < 0 \end{cases}$$
where M is the excitation value.
Compared with the prior art, the invention has the following advantages:
(1) A degree dictionary is constructed, and the intensity of each word modified by a degree word in the sentence is corrected by weighting; when several degree words occur consecutively, their corrections stack. A negation dictionary is constructed, and the polarity of each word modified by a negation word in the sentence is reversed; when several negation words occur consecutively, the reversals stack.
(2) The original sentence is split into segments at punctuation marks, user names and other non-word elements, and the corrections made by negation words and degree adverbs take effect only within their own segment, which avoids the emotional ambiguity and mixing that long sentences and complex sentence patterns tend to produce.
(3) The user names and emoticons in the text are matched and extracted by regular-expression matching and are separated from plain-text words, which avoids muddled information.
(4) New words in the test set are replaced with known words that occurred in the training set, which improves generalization.
(5) The original sentence can be used directly as input, with no manual word segmentation required.
(6) Negation words, degree adverbs and repeated words are taken into account when correcting the TF-IDF feature of each word, retaining information such as the intensity and position of words.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the method of this embodiment;
Fig. 2(a) and 2(b) show one example: histograms visualizing the TF-IDF features extracted with the conventional method and with the method of the invention, respectively;
Fig. 3(a) and 3(b) show another example: histograms visualizing the TF-IDF features extracted with the conventional method and with the method of the invention, respectively;
Fig. 4 is a schematic diagram of the hyperplane of the conventional method;
Fig. 5 is a schematic diagram of the hyperplane after feature reconstruction in this embodiment;
Fig. 6 compares the performance of the SVM classification models trained with the conventional method and with the method of this embodiment, using 10,000 texts as the training set and 500 texts as the test set.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and a specific embodiment. The embodiment is implemented on the premise of the technical solution of the invention and gives a detailed implementation and a specific operating procedure, but the scope of protection of the invention is not limited to the following embodiment.
Embodiment
A text TF-IDF feature reconstruction method combining emotional intensity first builds a stop-word dictionary, a degree dictionary and a negation dictionary. Stop words mainly include English characters, digits, mathematical symbols, punctuation marks and very high-frequency Chinese words such as "you", "I", "because" and "and". The degree dictionary contains a series of adverbs that modify the strength of adjectives and adverbs, together with their corresponding emotional intensities. The comparatively comprehensive "degree-level words" dictionary released by HowNet in 2007 is chosen as the degree dictionary, and its degree adverbs are divided into six levels, "extremely", "super", "very", "relatively", "slightly" and "insufficiently", corresponding to the six intensities 1.7, 1.5, 1.3, 1.1, 0.8 and 0.5 respectively. The negation dictionary contains common negation words such as "no", "not" and "non-". As shown in Fig. 1, the method includes the following steps:
1. Regular-expression matching
First, regular-expression matching is used to match and extract the user names and emoticons in the text, separating them from plain-text words. Each time an emoticon or user name is matched, its content and its position in the original sentence are recorded.
Regular expression for user names: "@{1}\w{1,30}$|@{1}\w{1,30}\s". Meaning: @ followed by a string of 1 to 30 characters (Chinese or English), followed by a user-name terminator (end of sentence or whitespace).
Regular expression for microblog emoticons: "\[\w{1,5}\]". Meaning: a string of 1 to 5 characters enclosed in square brackets "[ ]".
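The matching step can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `extract_special_tokens` is hypothetical, and the user-name pattern uses a lookahead instead of consuming the trailing whitespace, which is an assumption made so that the terminator is not swallowed.

```python
import re

# Patterns adapted from the description above. The lookahead (?=\s|$) is an
# assumption: it checks for the terminator without consuming it.
USER_RE = re.compile(r"@\w{1,30}(?=\s|$)")
EMOTICON_RE = re.compile(r"\[\w{1,5}\]")


def extract_special_tokens(text):
    """Return (plain_text, tokens); tokens holds (kind, content, start) tuples."""
    tokens = []
    for kind, pattern in (("user", USER_RE), ("emoticon", EMOTICON_RE)):
        for match in pattern.finditer(text):
            # Record the content and its position in the original sentence.
            tokens.append((kind, match.group(), match.start()))
    tokens.sort(key=lambda t: t[2])
    # Remove the matched spans so that only plain text remains.
    plain = EMOTICON_RE.sub(" ", USER_RE.sub(" ", text))
    return " ".join(plain.split()), tokens


plain, tokens = extract_special_tokens("@miuky you look great today [happy][happy]")
print(plain)
print(tokens)
```

In Python 3, `\w` also matches Chinese characters in `str` patterns, so the same patterns apply to Chinese user names and emoticon labels.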
2. Punctuation splitting (coarse segmentation)
Using punctuation marks as separators, the original sentence is split into clauses (coarse segmentation). The enumeration comma, quotation marks, dashes, single quotation marks and colons are not treated as separators, because these marks do not interrupt the semantic logic or the emotional continuity of a sentence. The modifying effect of a degree adverb or negation word applies only within its own clause and does not affect other clauses. This step simplifies the feature-reconstruction rules that follow and avoids mixed-up and redundant information.
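A minimal sketch of the coarse segmentation, assuming a particular set of clause-ending marks. The set `CLAUSE_BREAKS` and the function name `split_clauses` are illustrative choices and are not listed in the source; only the excluded marks are.

```python
import re

# Clause-ending punctuation (full-width and ASCII). This set is an assumption;
# the enumeration comma, quotation marks, dash and colon are deliberately absent.
CLAUSE_BREAKS = "，。！？；,.!?;"
_SPLIT_RE = re.compile("[" + re.escape(CLAUSE_BREAKS) + "]+")


def split_clauses(sentence):
    """Split a sentence into non-empty clauses at clause-ending punctuation."""
    return [part.strip() for part in _SPLIT_RE.split(sentence) if part.strip()]


print(split_clauses("I do not like driving, but today is fine!"))
print(split_clauses("He said: fine"))  # the colon does not split the clause
```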
3. Information mining
On the basis of the coarse segmentation of the previous step, each clause is segmented into words (fine segmentation, for example with the jieba word-segmentation library in Python), and each word in the clause is traversed, recording its number of occurrences and its positions. All stop words are then deleted, the emotional degree of the word following a degree adverb is corrected, and the emotional polarity of the word following a negation word is flipped.
Let the modified word be w, preceded by m degree adverbs and n negation words, and let $pow_i$ be the intensity value of the i-th degree adverb. The "emotional degree" attribute of word w is then calculated as:
$$deg_w = (-1)^n \prod_{i=1}^{m} pow_i$$
If several degree adverbs precede the modified word, the emotional intensity of the word is the product of the intensities of all of them; that is, the effects of degree adverbs stack. If several negation words precede the modified word, then by the principle that a double negative makes a positive, an odd number of negation words is equivalent to one negation word and an even number is equivalent to none.
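The rule described above (degree adverbs multiply, each negation flips the sign) can be sketched as follows. The English entries in `DEGREE` and `NEGATIONS` are toy stand-ins for the degree and negation dictionaries, and `emotional_intensity` is a hypothetical helper name; the six intensity values are the ones given in the embodiment.

```python
# Toy stand-ins for the degree dictionary and the negation dictionary (assumptions).
DEGREE = {
    "extremely": 1.7, "super": 1.5, "very": 1.3,
    "relatively": 1.1, "slightly": 0.8, "insufficiently": 0.5,
}
NEGATIONS = {"not", "no", "never"}


def emotional_intensity(modifiers):
    """Intensity of a word given the modifiers that precede it in its clause."""
    deg = 1.0
    negation_count = 0
    for modifier in modifiers:
        if modifier in DEGREE:
            deg *= DEGREE[modifier]      # degree adverbs stack multiplicatively
        elif modifier in NEGATIONS:
            negation_count += 1          # each negation flips the polarity
    return (-1) ** negation_count * deg


print(emotional_intensity(["not"]))                # one negation: negative polarity
print(emotional_intensity(["not", "not"]))         # double negation: positive again
print(emotional_intensity(["very", "extremely"]))  # product of both intensities
```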
4. Data correction
This step corrects the intensity value of a word for two special cases in Chinese. The first arises when a word is modified both by a negation word and by a degree adverb, in which case the relative position of the degree adverb and the negation word must be considered. For example, "I am not especially happy" expresses a positive emotion: the negation word "not" modifies "especially", turning its original strengthening effect into a weakening one, so although the overall emotion is still the positive "happy", its degree is weaker than that of plain "happy". Therefore, when a negation word precedes a degree adverb, the negative reciprocal of the emotional intensity obtained in the previous step is taken, so that the word is corrected to a positive emotion while its emotional intensity is weakened. That is:
$$deg_w \leftarrow -\frac{1}{deg_w}$$
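The positional correction can be sketched as below, assuming the negative-reciprocal rule applies whenever at least one negation word appears before a degree adverb in the same clause. The function name and the toy dictionaries are hypothetical, and the source only discusses the single-negation case, so behaviour for other combinations is an assumption of this sketch.

```python
def corrected_intensity(modifiers, degree, negations):
    """Intensity of a word, with the negation-before-degree-adverb correction."""
    deg = 1.0
    negation_count = 0
    negation_before_degree = False
    for modifier in modifiers:           # modifiers are in sentence order
        if modifier in negations:
            negation_count += 1
        elif modifier in degree:
            deg *= degree[modifier]
            if negation_count > 0:
                negation_before_degree = True
    deg *= (-1) ** negation_count        # base intensity from the previous step
    if negation_before_degree:
        deg = -1.0 / deg                 # negative reciprocal: weakened, polarity restored
    return deg


degree = {"especially": 1.5}             # toy value, an assumption
negations = {"not"}
print(corrected_intensity(["not", "especially"], degree, negations))  # weak positive
print(corrected_intensity(["especially", "not"], degree, negations))  # strong negative
```

With these toy values, "not especially happy" yields a positive intensity below 1, while "especially not happy" keeps the amplified negative intensity.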
5. Intensity weighting
For each text to be analyzed, an empty dictionary is created, keyed by word, with the word's intensity and count as the value.
All words are traversed again. If the current word is a stop word, degree adverb or negation word, it is skipped without any operation; if the dictionary does not yet contain the current word, the word is stored in the dictionary; if the dictionary already contains the current word, the two fields "intensity" (deg) and "count" (count) of that entry are updated:
Dict[w][deg] += deg_w
Dict[w][count] += 1
After all words of the text have been stored in the dictionary, the word intensities can be weighted. First, the total intensity Dict[w][deg] of the word is divided by its number of occurrences Dict[w][count] to obtain the average intensity of word w in the document:
$$\overline{deg_w} = \frac{Dict[w][deg]}{Dict[w][count]}$$
Because the same word may carry a positive or a negative emotional intensity within one text, the averaged intensity $\overline{deg_w}$ may be positive or negative. While the intensity and count of a word are updated, its intensity attribute is given an additional excitation value M in order to increase the intensity of words that are repeated in the sentence. Adding a positive excitation to a word whose emotional intensity is negative would weaken its original emotion or even reverse its polarity, so the average intensity $\overline{deg_w}$ is computed first, and a positive or negative excitation is then added according to its sign.
If word w occurs N times in total in the text, i.e. the final Dict[w][count] = N, the actual emotional intensity of word w is:
$$Deg_w = \begin{cases} \overline{deg_w} + \dfrac{N-1}{N} M, & \overline{deg_w} > 0 \\ \overline{deg_w} - \dfrac{N-1}{N} M, & \overline{deg_w} < 0 \end{cases}$$
As N increases, $\frac{N-1}{N}$ approaches 1, so the excitation $\frac{N-1}{N} M$ saturates as the number of occurrences N grows. In other words, when a word occurs repeatedly in the text its emotional intensity is strengthened, but the strengthening has an upper limit and each additional occurrence contributes less. The specific value of the excitation M is chosen by the user and serves as a tunable parameter of the method.
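The weighting step can be sketched as follows. The saturating factor (N-1)/N is an assumption inferred from the statement that the excitation approaches an upper limit as N grows; the function name `weighted_intensities` and the default value of `excitation` are hypothetical.

```python
def weighted_intensities(word_degs, excitation=0.2):
    """Average per-word intensities and add a saturating excitation for repeats.

    word_degs: (word, deg) pairs for one text, with stop words, degree adverbs
    and negation words already removed.
    """
    table = {}
    for word, deg in word_degs:
        entry = table.setdefault(word, {"deg": 0.0, "count": 0})
        entry["deg"] += deg        # Dict[w][deg] += deg_w
        entry["count"] += 1        # Dict[w][count] += 1

    result = {}
    for word, entry in table.items():
        n = entry["count"]
        mean = entry["deg"] / n
        boost = (n - 1) / n * excitation   # assumed form; tends to `excitation` as n grows
        if mean > 0:
            result[word] = mean + boost
        elif mean < 0:
            result[word] = mean - boost    # push negative words further negative
        else:
            result[word] = 0.0
    return result


print(weighted_intensities([("happy", 1.0), ("happy", 1.0), ("sad", -1.0)]))
```

A word that occurs only once receives no excitation, because the factor (N-1)/N is zero for N = 1.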
6. New-word replacement
When the method is initialized, a list is built that stores all words that occurred during training. During testing and actual use, each word is compared against the words in the list. When a new word that is absent from the list appears, the method applies a Word2Vec-based near-synonym replacement (for example with the synonyms package in Python): it compares the cosine similarity of the two words in the word-vector space and replaces the new word with the most similar known word in the list, which improves the generalization of the model.
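The replacement rule can be sketched with plain cosine similarity over toy vectors. The two-dimensional vectors, the function names and the fallback of returning the word unchanged when no vector is available are all assumptions; in practice the vectors would come from a trained Word2Vec model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def replace_new_word(word, known_words, vectors):
    """Replace a word unseen in training with its most similar known word."""
    if word in known_words or word not in vectors:
        return word                        # nothing to do, or no vector to compare
    candidates = [k for k in known_words if k in vectors]
    if not candidates:
        return word
    return max(candidates, key=lambda k: cosine(vectors[word], vectors[k]))


# Toy word vectors (assumptions); real ones would be Word2Vec embeddings.
vectors = {"sad": [0.9, 0.1], "happy": [-0.8, 0.2], "grief": [0.85, 0.2]}
known_words = {"sad", "happy"}
print(replace_new_word("grief", known_words, vectors))
```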
7. Feature reconstruction
The TF-IDF features of the text are extracted with an existing method (for example sklearn.feature_extraction.text.TfidfVectorizer in Python), and the TF-IDF value of each word is multiplied by its emotional intensity in the dictionary to obtain the reconstructed feature value:
$$\text{TF-IDF}_{new,w} = \text{TF-IDF}_{w} \times Deg_{w}$$
$\text{TF-IDF}_{new,w}$: the TF-IDF feature value of word w after reconstruction;
$\text{TF-IDF}_{w}$: the original TF-IDF feature value of word w obtained with TfidfVectorizer;
$Deg_{w}$: the emotional intensity value of word w.
In this way the TF-IDF values in the feature vector are reconstructed and a new feature vector is obtained.
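The final multiplication can be sketched as below. The TF-IDF values are passed in as a plain dictionary so that the example stays self-contained; in practice they would come from a TF-IDF extractor. The function name and the default intensity of 1.0 for words without a dictionary entry are assumptions.

```python
def reconstruct_features(tfidf, intensity):
    """Multiply each word's TF-IDF value by its emotional intensity."""
    # Words with no intensity entry keep their original value (assumed default 1.0).
    return {word: value * intensity.get(word, 1.0) for word, value in tfidf.items()}


# Illustrative values for "I do not like driving": the negation flips both words.
tfidf = {"drive": 0.89, "like": 0.45}
intensity = {"drive": -1.0, "like": -1.0}
print(reconstruct_features(tfidf, intensity))
```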
Fig. 2 and Fig. 3 show a specific example in which the TF-IDF features are extracted with the conventional method and with the new method proposed by the invention, and the results are visualized. The comparison shows that the conventional method loses the position and degree information of the sentence, so that two texts with opposite meanings receive the same features, whereas the TF-IDF feature reconstruction method proposed by the invention retains the position and degree information of the text and reflects it in the TF-IDF features.
Fig. 4 and Fig. 5 illustrate the effect of the proposed TF-IDF feature reconstruction method: Fig. 4 is the hyperplane schematic of the conventional method, and Fig. 5 is the schematic after feature reconstruction. It can be seen that degree adverbs lose their modifying effect in the conventional method, whereas the corresponding features are retained after reconstruction, which improves classification accuracy.
Fig. 6 compares the performance of the SVM classification models obtained with 10,000 texts as the training set and 500 texts as the test set. The comparison shows that the TF-IDF feature reconstruction method proposed by the invention outperforms the conventional method in accuracy, recall, precision and F1 score.
The following four examples illustrate the comparison:
Example 1: "@also likes miuky today you look very good today [happy][happy]"
Compared with the conventional method, the user name "@also likes miuky today" and the emoticon "[happy]" are segmented correctly.
Example 2: "I do not like driving"
Method | Feature vector |
---|---|
Conventional method | drive: 0.89, like: 0.45 |
Method proposed by the invention | like: -0.46, drive: -0.89 |
Compared with the conventional method, the reconstructed feature vector retains the information of the negation word "not".
Example 3: "I very much do not like driving"
Method | Feature vector |
---|---|
Method proposed by the invention | like: -0.79, drive: -1.51 |
The reconstructed feature vector retains the intensity information of the degree adverb "very".
Example 4: "Today my heart is filled with great grief and indignation" (note: the trained model contains the word "sad" but not "grief and indignation")
Method | Feature vector |
---|---|
Conventional method | today: 0.86, today: 0.52 |
Method proposed by the invention | heart: 0.52, grief and indignation: 1.11 |
Compared with the conventional method, the new word "grief and indignation" is recognized and the word information is retained more completely.
Claims (10)
1. A text TF-IDF feature reconstruction method combining emotional intensity, characterized by comprising the following steps:
S1. Construct a stop-word dictionary, a degree dictionary and a negation dictionary; the words in the degree dictionary are degree adverbs with emotional-intensity levels, and the words in the negation dictionary are negation words;
S2. Obtain the text to be analyzed and split it into multiple clauses, using punctuation marks as separators;
S3. Traverse each word in a clause and record its number of occurrences and its positions, delete the stop words, apply an emotional-intensity correction to the word following a degree adverb, and flip the emotional polarity of the word following a negation word;
S4. For each text to be analyzed, create an empty dictionary keyed by word, with the word's emotional intensity and count as the value. Traverse each word: if the current word is a stop word, degree adverb or negation word, skip it without any operation; if the dictionary does not yet contain the current word, store the word in the dictionary; if the dictionary already contains the current word, update that word's emotional intensity and count;
S5. Extract the TF-IDF feature values of the text and multiply the TF-IDF value of each word by its emotional intensity in the dictionary to obtain the reconstructed feature value:
$$\text{TF-IDF}_{new,w} = \text{TF-IDF}_{w} \times deg_{w}$$
where $\text{TF-IDF}_{new,w}$ is the TF-IDF feature value of word w after reconstruction, $\text{TF-IDF}_{w}$ is the original TF-IDF feature value of word w, and $deg_{w}$ is the emotional intensity of word w.
2. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that the stop-word dictionary includes English characters, digits and mathematical symbols.
3. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that the text to be analyzed is a microblog text containing user names and emoticons, and in step S2 regular-expression matching is first used to match and extract the user names and emoticons in the text, separating them from the plain text so that sentiment-bearing words inside them do not affect the sentiment of the whole text.
4. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 3, characterized in that a user name in the text is the text after the @ symbol, and an emoticon is the text within [ ] symbols.
5. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that in step S2 the separators between clauses are punctuation marks.
6. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 5, characterized in that the punctuation marks exclude the enumeration comma, quotation marks, dashes, single quotation marks and colons.
7. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that in step S3 the emotional intensity of a word is calculated as:
$$deg_w = (-1)^n \prod_{i=1}^{m} pow_i$$
where $deg_w$ is the emotional intensity of word w, the word is preceded by m degree adverbs and n negation words, and $pow_i$ is the intensity value of the i-th degree adverb.
8. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 7, characterized in that in step S3, if a negation word precedes a degree adverb, the emotional intensity $deg_w$ of the corresponding word is replaced by its negative reciprocal:
$$deg_w \leftarrow -\frac{1}{deg_w}$$
9. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that at initialization the method first builds a list that stores all words that occurred during training; in step S4 each word is compared against the list, and when a word is a new word that is absent from the list, the near-synonym replacement method is used to replace it with the most similar word in the list.
10. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that in step S4, after all words in the text have been stored in the dictionary, the emotional intensity of each word is also weighted, specifically by the following steps:
1) Divide the total emotional intensity Dict[w][deg] of the word by its number of occurrences Dict[w][count] to obtain the average emotional intensity of word w in the text:
$$\overline{deg_w} = \frac{Dict[w][deg]}{Dict[w][count]}$$
2) Calculate the actual emotional intensity of word w:
$$Deg_w = \begin{cases} \overline{deg_w} + \dfrac{N-1}{N} M, & \overline{deg_w} > 0 \\ \overline{deg_w} - \dfrac{N-1}{N} M, & \overline{deg_w} < 0 \end{cases}$$
where N = Dict[w][count] is the number of occurrences of word w and M is the excitation value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910224082.0A CN110096597B (en) | 2019-03-22 | 2019-03-22 | Text TF-IDF characteristic reconstruction method combining emotion intensity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110096597A true CN110096597A (en) | 2019-08-06 |
CN110096597B CN110096597B (en) | 2023-07-04 |
Family
ID=67444027
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910224082.0A Active CN110096597B (en) | 2019-03-22 | 2019-03-22 | Text TF-IDF characteristic reconstruction method combining emotion intensity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110096597B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111859944A (en) * | 2020-07-17 | 2020-10-30 | 维沃移动通信有限公司 | Information display method and device and electronic equipment |
CN113111653A (en) * | 2021-04-07 | 2021-07-13 | 同济大学 | Text feature construction method based on Word2Vec and syntactic dependency tree |
CN113204624A (en) * | 2021-06-07 | 2021-08-03 | 吉林大学 | Multi-feature fusion text emotion analysis model and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105138506A (en) * | 2015-07-09 | 2015-12-09 | 天云融创数据科技(北京)有限公司 | Financial text sentiment analysis method |
CN105528410A (en) * | 2015-12-05 | 2016-04-27 | 浙江大学 | Method for concluding and classifying online comments of hospital |
CN106296288A (en) * | 2016-08-10 | 2017-01-04 | 常州大学 | A kind of commodity method of evaluating performance under assessing network text guiding |
CN106528533A (en) * | 2016-11-08 | 2017-03-22 | 浙江理工大学 | Dynamic sentiment word and special adjunct word-based text sentiment analysis method |
CN108197104A (en) * | 2017-12-27 | 2018-06-22 | 浙江力石科技股份有限公司 | Text analyzing method, apparatus and cloud platform |
Non-Patent Citations (2)
Title |
---|
王日宏等: "改进的基于语义理解的文本情感分类方法研究", 《计算机科学》 * |
陈国兰: "基于情感词典与语义规则的微博情感分析", 《情报探索》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111859944A (en) * | 2020-07-17 | 2020-10-30 | 维沃移动通信有限公司 | Information display method and device and electronic equipment |
CN111859944B (en) * | 2020-07-17 | 2022-12-13 | 维沃移动通信有限公司 | Information display method and device and electronic equipment |
CN113111653A (en) * | 2021-04-07 | 2021-07-13 | 同济大学 | Text feature construction method based on Word2Vec and syntactic dependency tree |
CN113204624A (en) * | 2021-06-07 | 2021-08-03 | 吉林大学 | Multi-feature fusion text emotion analysis model and device |
CN113204624B (en) * | 2021-06-07 | 2022-06-14 | 吉林大学 | Multi-feature fusion text emotion analysis model and device |
Also Published As
Publication number | Publication date |
---|---|
CN110096597B (en) | 2023-07-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||