CN110096597A - Text TF-IDF feature reconstruction method combining emotional intensity - Google Patents

Text TF-IDF feature reconstruction method combining emotional intensity

Publication number
CN110096597A
Authority
CN
China
Prior art keywords: word, text, idf, emotional intensity, dictionary
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910224082.0A
Other languages
Chinese (zh)
Other versions
CN110096597B (en)
Inventor
邓修齐
康琦
张量
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN201910224082.0A
Publication of CN110096597A
Application granted
Publication of CN110096597B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 Thesaurus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present invention relates to a text TF-IDF feature reconstruction method combining emotional intensity. Emoticons and user names are extracted and separated from the text by regular-expression matching; word intensity is corrected according to an intensity dictionary and the positional relationships of negation words, degree adverbs and repeated words; and new words are replaced by a Word2Vec-based near-synonym replacement method, so that the TF-IDF feature vector of the text is reconstructed. Compared with the prior art, the invention takes negation words, degree adverbs and repeated words into account when correcting the TF-IDF features of words, retaining information such as word intensity and position; it replaces new words in the test set with known words that occur in the training set, which improves generalization performance; and it accepts the original sentence directly as input, with no manual word segmentation required.

Description

Text TF-IDF feature reconstruction method combining emotional intensity
Technical field
The invention belongs to the field of classification in natural language processing and relates to a text classification preprocessing method, in particular to a text TF-IDF feature reconstruction method combining emotional intensity.
Background
At present, natural language processing and machine learning commonly use term frequency-inverse document frequency (TF-IDF) to construct the feature vector of a text. Internet language, typified by microblogs, contains special components such as emoticons and user names; existing methods do not handle them, which causes information to be confused. Elements of Chinese text such as negation words, degree adverbs and repeated words directly affect the emotional intensity and polarity of a text; the feature vectors obtained by existing methods cannot retain this information, which distorts it. The test set and real-world use contain new words that are absent from the training set; existing methods discard them, which loses information.
Summary of the invention
The object of the present invention is to overcome the above drawbacks of the prior art by providing a preprocessing method for text sentiment analysis. Emoticons and user names are extracted and separated by regular-expression matching; word intensity is corrected according to an intensity dictionary and the positional relationships of negation words, degree adverbs and repeated words; and new words are replaced by a Word2Vec-based near-synonym replacement method, so that the TF-IDF feature vector of the text is reconstructed.
The object of the present invention is achieved through the following technical solution:
A text TF-IDF feature reconstruction method combining emotional intensity, comprising the following steps:
S1: construct a stop-word dictionary, a degree dictionary and a negation dictionary, where the words in the degree dictionary are degree adverbs with emotional intensity levels and the words in the negation dictionary are negation words;
S2: obtain the text to be analyzed and split it into multiple clauses, using punctuation marks as separators;
S3: traverse each word in each clause and record the number of times and the positions at which it occurs, delete the stop words, correct the emotional intensity of the words that follow a degree adverb, and flip the emotional polarity of the words that follow a negation word;
S4: create an empty dictionary for each text to be analyzed, with words as keys and each word's emotional intensity and count as values; traverse each word; if the current word is a stop word, a degree adverb or a negation word, skip it without any operation; if the dictionary does not yet contain the current word, store the word in the dictionary; if the dictionary already contains the current word, update the emotional intensity and count of that word in the dictionary;
S5: extract the TF-IDF feature values of the text and multiply the TF-IDF value of each word by its corresponding emotional intensity in the dictionary to obtain the reconstructed feature value:
$$\text{TF-IDF}_{new,w} = \text{TF-IDF}_{w} \times deg_{w}$$
where $\text{TF-IDF}_{new,w}$ is the TF-IDF feature value of word w after reconstruction, $\text{TF-IDF}_{w}$ is the original TF-IDF feature value of word w, and $deg_{w}$ is the emotional intensity of word w.
The stop-word dictionary includes English characters, digits and mathematical symbols.
The text to be analyzed is microblog text containing user names and emoticons. In step S2, regular-expression matching is first used to match and extract the user names (the text after the @ symbol) and the emoticons (the text inside [ ] symbols), distinguishing them from plain text so that sentiment-bearing words inside them do not affect the sentiment of the text as a whole.
In step S2, the clauses are separated at punctuation marks.
The punctuation marks used as separators do not include the pause mark (、), quotation marks, dashes, single quotation marks or colons.
In step S3, the emotional intensity of a word is calculated as:
$$deg_{w} = (-1)^{n} \prod_{i=1}^{m} pow_{i}$$
where $deg_{w}$ is the emotional intensity of word w, the word is preceded by m degree adverbs and n negation words, and $pow_{i}$ is the intensity value of the i-th degree adverb.
In step S3, if a negation word appears before a degree adverb, the emotional intensity of the corresponding word is corrected to its negative reciprocal:
$$deg_{w} \leftarrow -\frac{1}{deg_{w}}$$
At initialization, the method first builds a list that stores all words seen during training. In step S4, each word is compared with the list; when the word is a new word absent from the list, the near-synonym replacement method replaces it with the word in the list that has the highest similarity.
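In step S4, after all words in the text have been stored in the dictionary, the emotional intensity of each word is further weighted, specifically through the following steps:
1) divide the total emotional intensity Dict[w][deg] of the word by its number of occurrences Dict[w][count] to obtain the average emotional intensity of word w in the text:
$$\overline{deg_{w}} = \frac{Dict[w][deg]}{Dict[w][count]}$$
2) apply an excitation to the total emotional intensity Dict[w][deg] of word w: if $\overline{deg_{w}} > 0$, Dict[w][deg] is updated to Dict[w][deg] + $deg_{w}$ + M; if $\overline{deg_{w}} < 0$, Dict[w][deg] is updated to Dict[w][deg] + $deg_{w}$ - M. After simplification, the real emotional intensity of word w is:
$$Deg_{w} = \begin{cases} \overline{deg_{w}} + \dfrac{N-1}{N} M, & \overline{deg_{w}} > 0 \\ \overline{deg_{w}} - \dfrac{N-1}{N} M, & \overline{deg_{w}} < 0 \end{cases}$$
where M is the excitation value and N is the total number of occurrences of word w in the text, i.e. the final value of Dict[w][count].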
Compared with the prior art, the invention has the following advantages:
(1) A degree dictionary is constructed, and the intensity of words modified by degree words in a sentence is corrected by weighting; the correction effects stack when several degree words occur in succession. A negation dictionary is constructed, and the polarity of words modified by negation words in a sentence is reversed; the reversal effects stack when several negation words occur in succession.
(2) The original sentence is segmented at non-lexical elements such as punctuation marks and user names, and the corrections applied by negation words and degree adverbs take effect only within a segment, which avoids the sentiment ambiguity and confusion that long sentences and complex sentence patterns tend to produce.
(3) User names and emoticons in the text are matched and extracted by regular-expression matching and distinguished from plain-text words, which avoids confusing the information.
(4) New words in the test set are replaced with known words that occur in the training set, which improves generalization performance.
(5) The original sentence can be used directly as input, with no manual word segmentation required.
(6) Negation words, degree adverbs and repeated words are taken into account when correcting the TF-IDF features of words, retaining information such as word intensity and position.
Brief description of the drawings
Fig. 1 is a flow diagram of the method of this embodiment;
Fig. 2(a) and 2(b) show one example: bar charts visualizing the TF-IDF features extracted by the conventional method (Fig. 2(a)) and by the method of the invention (Fig. 2(b));
Fig. 3(a) and 3(b) show another example: bar charts visualizing the TF-IDF features extracted by the conventional method (Fig. 3(a)) and by the method of the invention (Fig. 3(b));
Fig. 4 is a schematic diagram of the hyperplane for the conventional method;
Fig. 5 is a schematic diagram of the hyperplane after feature reconstruction in this embodiment;
Fig. 6 compares the performance of the SVM classification models trained with the conventional method and with the method of this embodiment, using 10,000 texts as the training set and 500 texts as the test set.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and a specific embodiment. The embodiment is implemented on the basis of the technical solution of the invention, and a detailed implementation and specific operating procedure are given, but the scope of protection of the invention is not limited to the following embodiment.
Embodiment
In a text TF-IDF feature reconstruction method combining emotional intensity, a stop-word dictionary, a degree dictionary and a negation dictionary are built first. Stop words mainly include English characters, digits, mathematical symbols, punctuation marks and Chinese words with extremely high frequency of use, such as "you", "I", "because" and "and". The degree dictionary contains a series of adverbs that modify the strength of adjectives and adverbs, together with their corresponding emotional intensities. The comparatively comprehensive "degree level word" dictionary released by HowNet in 2007 is selected as the degree dictionary, and its degree adverbs are divided into six levels, "extremely", "super", "very", "rather", "slightly" and "insufficiently", corresponding to the six intensities 1.7, 1.5, 1.3, 1.1, 0.8 and 0.5 respectively. The negation dictionary contains common negation words such as "no", "not" and "non-". As shown in Fig. 1, the method comprises the following procedure:
1. Regular-expression matching
Regular-expression matching is first used to match and extract the user names and emoticons in the text, distinguishing them from plain-text words. Each time an emoticon or user name is matched, its content and its position in the original sentence are recorded.
User name regular expression: "@{1}\w{1,30}$|@{1}\w{1,30}\s". Meaning: @ followed by a character string of length 1 to 30 (Chinese characters or English), followed by a user name terminator (end of sentence or whitespace).
Microblog emoticon regular expression: "\[\w{1,5}\]". Meaning: a character string of length 1 to 5 enclosed in square brackets "[ ]".
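The matching step can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `extract_special_tokens` is hypothetical, and the user-name pattern uses a lookahead for the terminator instead of consuming it, which is a small variation on the expression given above.

```python
import re

# In Python 3, \w on str patterns matches Chinese characters as well as
# Latin letters, digits and underscore.
USER_RE = re.compile(r"@\w{1,30}(?=\s|$)")   # "@" + 1..30 word chars + terminator
EMOTICON_RE = re.compile(r"\[\w{1,5}\]")     # 1..5 word chars inside [ ]


def extract_special_tokens(text):
    """Return (plain_text, tokens); each token is (kind, content, start)."""
    tokens = []
    for kind, pattern in (("user", USER_RE), ("emoticon", EMOTICON_RE)):
        for match in pattern.finditer(text):
            tokens.append((kind, match.group(), match.start()))
    tokens.sort(key=lambda token: token[2])  # order by position in the sentence
    # Blank out the matched spans so that only plain text remains.
    plain = EMOTICON_RE.sub(" ", USER_RE.sub(" ", text)).strip()
    return plain, tokens


plain, tokens = extract_special_tokens("@miuky 今天你很好看[开心][开心]")
```

Recording the start offset alongside the content keeps the positional information that the description asks for, while the plain text can be passed on to the later steps.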
2. Punctuation segmentation (coarse segmentation)
Using punctuation marks as separators, the original sentence is divided into clauses (coarse segmentation). The pause mark (、), quotation marks, dashes, single quotation marks and colons are excluded from the separators, because these marks do not interrupt the semantic logic or emotional continuity of a sentence. The modifying effect of degree adverbs and negation words takes effect only within their own clause and does not affect other clauses. This step simplifies the later feature reconstruction rules to some extent and avoids confusion and redundancy of information.
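A possible sketch of the coarse segmentation in Python is shown below. The exact separator set is not listed in the description, so `CLAUSE_DELIMITERS` is an assumption; it contains common Chinese and ASCII sentence punctuation and leaves out the pause mark, quotation marks, dashes and colons.

```python
import re

# Assumed separator set (not specified in the source). The pause mark "、",
# quotation marks, dashes and colons are deliberately absent.
CLAUSE_DELIMITERS = "，。！？；,.!?;"
_SPLIT_RE = re.compile("[" + re.escape(CLAUSE_DELIMITERS) + "]+")


def split_clauses(text):
    """Split a sentence into non-empty clauses at the assumed separators."""
    return [part.strip() for part in _SPLIT_RE.split(text) if part.strip()]


clauses = split_clauses("我不喜欢开车，但是我很喜欢散步。")
```

Because the modifiers are later applied clause by clause, the negation in the first clause of this sentence cannot leak into the second.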
3. Information mining
On the basis of the coarse segmentation of the previous step, each clause is segmented into words (fine segmentation, for example with the jieba word segmentation library in Python); each word in the clause is traversed and the number of times and positions at which it occurs are recorded. All stop words are then deleted, the emotion degree of words that follow a degree adverb is corrected, and the emotional polarity of words that follow a negation word is flipped.
Let the modified word be w, preceded by m degree adverbs and n negation words, with $pow_{i}$ the intensity value of the i-th degree adverb. The "emotion degree" attribute of word w is then calculated as follows:
$$deg_{w} = (-1)^{n} \prod_{i=1}^{m} pow_{i}$$
If several degree adverbs precede the modified word, its emotional intensity value is the product of the intensities of all those degree adverbs; that is, the effects of degree adverbs stack. If several negation words precede the modified word, then by the "two negatives make a positive" principle an odd number of negation words is equivalent to one negation word and an even number is equivalent to none.
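The stacking rule described above (product of degree-adverb intensities, sign flipped once per negation word) can be sketched as follows. The dictionaries are small hypothetical excerpts that use the English level names and intensities quoted in this embodiment, and `modifier_intensity` is an illustrative name.

```python
# Hypothetical dictionary excerpts; the intensities are those listed in the embodiment.
DEGREE = {"extremely": 1.7, "very": 1.3, "slightly": 0.8}
NEGATION = {"no", "not"}


def modifier_intensity(preceding_words):
    """Intensity of a word given the modifiers that precede it in its clause."""
    strength = 1.0
    negations = 0
    for word in preceding_words:
        if word in DEGREE:
            strength *= DEGREE[word]   # degree adverbs multiply together
        elif word in NEGATION:
            negations += 1             # each negation flips the sign once
    return (-1) ** negations * strength
```

With no modifiers the intensity is 1, so an unmodified word keeps its original TF-IDF value when the two are multiplied later.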
4. Data correction
This step corrects the intensity value of words mainly for two special cases in Chinese. In the first case, a word is modified by both a negation word and a degree adverb, and the relative position of the degree adverb and the negation word must be considered. For example, "I am not especially happy" expresses a positive emotion: the negation word "not" modifies "especially", so that its original strengthening effect becomes a weakening one; the overall emotion is still the positive "happy", but its degree is weaker than plain "happy". Therefore, when a negation word appears before a degree adverb, the negative reciprocal of the emotional intensity obtained in the previous step is taken, so that the word is corrected to a positive emotion and its emotional intensity is weakened at the same time. That is:
$$deg_{w} \leftarrow -\frac{1}{deg_{w}}$$
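A minimal sketch of this correction, assuming it amounts to taking the negative reciprocal of the previously computed intensity whenever a negation word precedes a degree adverb in the same clause. The function name and the way the dictionaries are passed in are illustrative; cases with several negation words before the adverb are not specified in the description and are handled here by the same rule.

```python
def corrected_intensity(preceding_words, degree, negation):
    """Intensity of a word, with the negation-before-degree-adverb correction."""
    strength = 1.0
    negations = 0
    negation_before_degree = False
    for word in preceding_words:          # modifiers in clause order
        if word in negation:
            negations += 1
        elif word in degree:
            strength *= degree[word]
            if negations > 0:
                negation_before_degree = True
    deg = (-1) ** negations * strength    # uncorrected intensity from step 3
    if negation_before_degree:
        deg = -1.0 / deg                  # negative reciprocal: weakened, sign restored
    return deg


weak_positive = corrected_intensity(["not", "very"], {"very": 1.3}, {"not"})
strong_negative = corrected_intensity(["very", "not"], {"very": 1.3}, {"not"})
```

For "not very", the uncorrected value is -1.3 and the corrected value is about 0.77, a weakened positive; for "very not", the value stays at -1.3.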
5. Intensity weighting
An empty dictionary is created for each text to be analyzed, with words as keys and each word's intensity and count as values.
All words are traversed again. If the current word is a stop word, degree adverb or negation word, it is skipped without any operation; if the dictionary does not yet contain the current word, the word is stored in the dictionary; if the dictionary already contains the current word, the "intensity" (deg) and "count" (count) entries of that word are updated, and while they are updated an additional excitation value M is given to the intensity attribute in order to increase the emotional intensity of repeated words:
Dict[w][deg] += deg_w
Dict[w][count] += 1
After all words of the text have been stored in the dictionary, the word intensities can be weighted. First, the total intensity Dict[w][deg] of the word is divided by its number of occurrences Dict[w][count] to obtain the average intensity of word w in the document:
$$\overline{deg_{w}} = \frac{Dict[w][deg]}{Dict[w][count]}$$
Because the emotional intensity of the same word within one text may be positive or negative, the averaged intensity $\overline{deg_{w}}$ may also be positive or negative. If a positive excitation were added to a word whose emotional intensity is negative, its original emotion would be weakened or its polarity even reversed. The average intensity $\overline{deg_{w}}$ is therefore calculated first, and a positive or negative excitation is then added according to its sign.
If word w occurs N times in the text in total, i.e. the final Dict[w][count] = N, the real emotional intensity of word w is:
$$Deg_{w} = \begin{cases} \overline{deg_{w}} + \dfrac{N-1}{N} M, & \overline{deg_{w}} > 0 \\ \overline{deg_{w}} - \dfrac{N-1}{N} M, & \overline{deg_{w}} < 0 \end{cases}$$
As N increases, $\frac{N-1}{N}$ approaches 1, so the excitation $\frac{N-1}{N} M$ saturates as the number of occurrences N grows. In other words, when a word occurs repeatedly in a text its emotional intensity is strengthened, but the strengthening has an upper limit, and the effect of each additional occurrence becomes weaker as the number of occurrences increases. The specific value of the excitation M is chosen by the user and is an adjustable parameter of the method.
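The weighting step can be sketched as below. The closed form used here, mean intensity plus or minus (N-1)/N times M, is an assumption reconstructed from the statement that the excitation factor approaches 1 as N grows; the function name `real_intensities` is hypothetical, and leaving a zero mean unchanged is also an assumption.

```python
def real_intensities(occurrences, excitation):
    """occurrences: (word, deg) pairs for one text; excitation: the value M."""
    table = {}
    for word, deg in occurrences:
        entry = table.setdefault(word, {"deg": 0.0, "count": 0})
        entry["deg"] += deg        # Dict[w][deg] += deg_w
        entry["count"] += 1        # Dict[w][count] += 1

    result = {}
    for word, entry in table.items():
        n = entry["count"]
        mean = entry["deg"] / n
        boost = (n - 1) / n * excitation   # saturates towards M as n grows
        if mean > 0:
            result[word] = mean + boost
        elif mean < 0:
            result[word] = mean - boost
        else:
            result[word] = mean            # assumption: no excitation for a zero mean
    return result


intensities = real_intensities([("happy", 1.0), ("happy", 1.0), ("sad", -1.0)], 0.5)
```

A word that occurs once receives no excitation, and the excitation for a repeated word never exceeds M, which matches the saturation behaviour described above.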
6. New word replacement
When the method is initialized, a list is built that stores all words seen during training. During testing and practical use, each word is compared with the words in the list. When a new word absent from the list appears, the method uses a Word2Vec-based near-synonym replacement method (for example the synonyms package in Python), compares the cosine similarity of two words in the word-vector space, and replaces the new word with the known word in the list that has the highest similarity, thereby improving the generalization performance of the model.
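The replacement rule can be illustrated with plain Python and toy vectors. This sketch does not call Word2Vec or the synonyms package; the two-dimensional vectors and the names `cosine` and `replace_new_word` are made up for illustration, and a real system would look the vectors up in a trained embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def replace_new_word(word, known_words, vectors):
    """Map a word absent from known_words to its most similar known word."""
    if word in known_words or word not in vectors:
        return word                       # nothing to do, or no vector available
    candidates = [w for w in known_words if w in vectors]
    if not candidates:
        return word
    return max(candidates, key=lambda w: cosine(vectors[word], vectors[w]))


# Toy vectors, invented for this example only.
vectors = {"grief": [0.9, 0.1], "sad": [0.8, 0.2], "happy": [-0.7, 0.3]}
known = ["sad", "happy"]
replacement = replace_new_word("grief", known, vectors)
```

Words that have no vector at all are returned unchanged here; how such words should be treated is not covered by the description.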
7. Feature reconstruction
The TF-IDF features of the text are extracted by an existing method (for example TfidfVectorizer from the scikit-learn package in Python), and the TF-IDF value of each word is multiplied by its corresponding emotional intensity in the dictionary to obtain the reconstructed feature value:
$$\text{TF-IDF}_{new,w} = \text{TF-IDF}_{w} \times Deg_{w}$$
$\text{TF-IDF}_{new,w}$: the TF-IDF feature value of word w after reconstruction;
$\text{TF-IDF}_{w}$: the original TF-IDF feature value of word w obtained with TfidfVectorizer;
$Deg_{w}$: the emotional intensity value of word w.
In this way, the TF-IDF values in the feature vector are reconstructed and a new feature vector is obtained.
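The final multiplication can be sketched with plain dictionaries. The TF-IDF values below are illustrative numbers rather than the output of TfidfVectorizer, the name `reconstruct_features` is hypothetical, and treating a word without a recorded intensity as having intensity 1.0 is an assumption.

```python
def reconstruct_features(tfidf, intensity):
    """Multiply each word's TF-IDF value by its emotional intensity."""
    # Words without a recorded intensity are left unchanged (assumed default 1.0).
    return {word: value * intensity.get(word, 1.0) for word, value in tfidf.items()}


tfidf = {"drive": 0.89, "like": 0.45}      # illustrative TF-IDF values
intensity = {"drive": -1.0, "like": -1.0}  # both words follow a negation word
features = reconstruct_features(tfidf, intensity)
```

The negation now shows up as a sign change in the feature vector, so a sentence and its negated form no longer share identical features.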
Fig. 2 and Fig. 3 show a specific example in which TF-IDF features are extracted by the conventional method and by the new method proposed by the invention, and the results are visualized. The comparison shows that the conventional method loses the positional and degree information of the sentence, so that two texts with opposite meanings have the same features; the proposed TF-IDF feature reconstruction method retains the positional and degree information of the text and reflects it in the TF-IDF features.
Fig. 4 and Fig. 5 illustrate the effect of the proposed TF-IDF feature reconstruction method: Fig. 4 is a schematic diagram of the hyperplane for the conventional method, and Fig. 5 is a schematic diagram after feature reconstruction. It can be seen that in the conventional method the degree adverbs lose their modifying effect, whereas after reconstruction the corresponding features are retained, which improves classification accuracy.
Fig. 6 compares the performance of the SVM classification models obtained by training with 10,000 texts as the training set and 500 texts as the test set. The comparison shows that the proposed TF-IDF feature reconstruction method is better than the conventional method in accuracy, recall, precision and F1 score.
The following four examples illustrate the comparison:
Example 1: "@also likes miuky today you are very good-looking today [happy] [happy]"
Compared with the conventional method, the user name "@also likes miuky today" and the emoticon "[happy]" are correctly separated.
Example 2: "I do not like driving"
Method | Feature vector
Conventional method | drive: 0.89, like: 0.45
Method proposed by the invention | like: -0.46, drive: -0.89
Compared with the conventional method, the reconstructed feature vector retains the information of the negation word "not".
Example 3: "I do not like driving very much"
Method | Feature vector
Method proposed by the invention | like: -0.79, drive: -1.51
The reconstructed feature vector retains the intensity information of the degree adverb "very".
Example 4: "Today, my heart feels very much grief and indignation" (note: the trained model contains the word "sad" but not "grief and indignation")
Method | Feature vector
Conventional method | today: 0.86, today: 0.52
Method proposed by the invention | heart: 0.52, grief and indignation: 1.11
Compared with the conventional method, the new word "grief and indignation" can be recognized, and word information is retained more completely.

Claims (10)

1. A text TF-IDF feature reconstruction method combining emotional intensity, characterized by comprising the following steps:
S1: construct a stop-word dictionary, a degree dictionary and a negation dictionary, where the words in the degree dictionary are degree adverbs with emotional intensity levels and the words in the negation dictionary are negation words;
S2: obtain the text to be analyzed and split it into multiple clauses, using punctuation marks as separators;
S3: traverse each word in each clause and record the number of times and the positions at which it occurs, delete the stop words, correct the emotional intensity of the words that follow a degree adverb, and flip the emotional polarity of the words that follow a negation word;
S4: create an empty dictionary for each text to be analyzed, with words as keys and each word's emotional intensity and count as values; traverse each word; if the current word is a stop word, a degree adverb or a negation word, skip it without any operation; if the dictionary does not yet contain the current word, store the word in the dictionary; if the dictionary already contains the current word, update the emotional intensity and count of that word in the dictionary;
S5: extract the TF-IDF feature values of the text and multiply the TF-IDF value of each word by its corresponding emotional intensity in the dictionary to obtain the reconstructed feature value:
$$\text{TF-IDF}_{new,w} = \text{TF-IDF}_{w} \times deg_{w}$$
where $\text{TF-IDF}_{new,w}$ is the TF-IDF feature value of word w after reconstruction, $\text{TF-IDF}_{w}$ is the original TF-IDF feature value of word w, and $deg_{w}$ is the emotional intensity of word w.
2. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that the stop-word dictionary includes English characters, digits and mathematical symbols.
3. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that the text to be analyzed is microblog text containing user names and emoticons, and in step S2 regular-expression matching is first used to match and extract the user names and emoticons in the text, distinguishing them from plain text so that sentiment-bearing words inside them do not affect the sentiment of the text as a whole.
4. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 3, characterized in that a user name is the text after the @ symbol and an emoticon is the text inside [ ] symbols.
5. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that in step S2 the clauses are separated at punctuation marks.
6. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 5, characterized in that the punctuation marks do not include the pause mark (、), quotation marks, dashes, single quotation marks or colons.
7. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that in step S3 the emotional intensity of a word is calculated as:
$$deg_{w} = (-1)^{n} \prod_{i=1}^{m} pow_{i}$$
where $deg_{w}$ is the emotional intensity of word w, the word is preceded by m degree adverbs and n negation words, and $pow_{i}$ is the intensity value of the i-th degree adverb.
8. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 7, characterized in that in step S3, if a negation word appears before a degree adverb, the emotional intensity of the corresponding word is corrected to its negative reciprocal:
$$deg_{w} \leftarrow -\frac{1}{deg_{w}}$$
9. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that at initialization the method first builds a list that stores all words seen during training; in step S4 each word is compared with the list, and when the word is a new word absent from the list, the near-synonym replacement method replaces it with the word in the list that has the highest similarity.
10. The text TF-IDF feature reconstruction method combining emotional intensity according to claim 1, characterized in that in step S4, after all words in the text have been stored in the dictionary, the emotional intensity of each word is further weighted, specifically through the following steps:
1) divide the total emotional intensity Dict[w][deg] of the word by its number of occurrences Dict[w][count] to obtain the average emotional intensity of word w in the text:
$$\overline{deg_{w}} = \frac{Dict[w][deg]}{Dict[w][count]}$$
2) calculate the real emotional intensity of word w:
$$Deg_{w} = \begin{cases} \overline{deg_{w}} + \dfrac{N-1}{N} M, & \overline{deg_{w}} > 0 \\ \overline{deg_{w}} - \dfrac{N-1}{N} M, & \overline{deg_{w}} < 0 \end{cases}$$
where M is the excitation value and N is the total number of occurrences of word w in the text.
CN201910224082.0A 2019-03-22 2019-03-22 Text TF-IDF characteristic reconstruction method combining emotion intensity Active CN110096597B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910224082.0A | 2019-03-22 | 2019-03-22 | Text TF-IDF characteristic reconstruction method combining emotion intensity

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201910224082.0A | 2019-03-22 | 2019-03-22 | Text TF-IDF characteristic reconstruction method combining emotion intensity

Publications (2)

Publication Number | Publication Date
CN110096597A (en) | 2019-08-06
CN110096597B | 2023-07-04

Family

ID=67444027

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201910224082.0A (Active, CN110096597B (en)) | Text TF-IDF characteristic reconstruction method combining emotion intensity | 2019-03-22 | 2019-03-22

Country Status (1)

Country | Link
CN (1) | CN110096597B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105138506A (en) * | 2015-07-09 | 2015-12-09 | 天云融创数据科技(北京)有限公司 | Financial text sentiment analysis method
CN105528410A (en) * | 2015-12-05 | 2016-04-27 | 浙江大学 | Method for concluding and classifying online comments of hospital
CN106296288A (en) * | 2016-08-10 | 2017-01-04 | 常州大学 | A kind of commodity method of evaluating performance under assessing network text guiding
CN106528533A (en) * | 2016-11-08 | 2017-03-22 | 浙江理工大学 | Dynamic sentiment word and special adjunct word-based text sentiment analysis method
CN108197104A (en) * | 2017-12-27 | 2018-06-22 | 浙江力石科技股份有限公司 | Text analyzing method, apparatus and cloud platform

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王日宏等: "改进的基于语义理解的文本情感分类方法研究", 《计算机科学》 *
陈国兰: "基于情感词典与语义规则的微博情感分析", 《情报探索》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111859944A (en) * | 2020-07-17 | 2020-10-30 | 维沃移动通信有限公司 | Information display method and device and electronic equipment
CN111859944B (en) * | 2020-07-17 | 2022-12-13 | 维沃移动通信有限公司 | Information display method and device and electronic equipment
CN113111653A (en) * | 2021-04-07 | 2021-07-13 | 同济大学 | Text feature construction method based on Word2Vec and syntactic dependency tree
CN113204624A (en) * | 2021-06-07 | 2021-08-03 | 吉林大学 | Multi-feature fusion text emotion analysis model and device
CN113204624B (en) * | 2021-06-07 | 2022-06-14 | 吉林大学 | Multi-feature fusion text emotion analysis model and device

Also Published As

Publication number Publication date
CN110096597B (en) 2023-07-04

Legal Events

Code | Title | Description
PB01 | Publication | Publication
SE01 | Entry into force of request for substantive examination | Entry into force of request for substantive examination
GR01 | Patent grant | Patent grant
GR01 Patent grant