CN110598219A - Emotion analysis method for Douban movie reviews - Google Patents

Info

Publication number
CN110598219A
Authority
CN
China
Prior art keywords
emotion
word
words
negative
dictionary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911009781.XA
Other languages
Chinese (zh)
Inventor
吴杰胜
陆奎
董涛
刘舜
苏树智
吴佳昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University of Science and Technology
Original Assignee
Anhui University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University of Science and Technology
Priority to CN201911009781.XA
Publication of CN110598219A
Current legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 - Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 - Thesaurus
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 - Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Library & Information Science (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to an emotion analysis method for Douban movie reviews, used mainly to analyze the emotion of Chinese-language movie reviews on Douban. First, movie reviews are crawled from Douban and preprocessed; the preprocessing comprises stop-word removal, word segmentation and part-of-speech tagging. Second, the four dictionaries required for movie review emotion analysis are constructed: a basic emotion dictionary, a negative word dictionary, a degree adverb dictionary and a movie-review domain emotion dictionary. Third, a designed emotion calculation method is applied to each review to judge its emotion polarity. The emotion polarity of the review is then also judged from the weak annotation information given by the user's rating. If the polarity obtained by emotion calculation agrees with the polarity judged from the weak annotation information, that polarity is taken as the emotion polarity of the review; if the two disagree, the emotion polarity of the review is determined by the emotion calculation.

Description

Emotion analysis method for Douban movie reviews
Technical Field
The invention belongs to the technical field of text sentiment analysis in natural language processing, and particularly relates to an emotion analysis method for Douban movie reviews.
Background
Douban is a widely used social media platform for movie reviews and carries a massive amount of information. After each movie is released, many users post reviews on Douban. This large body of subjective review text contains rich emotional information, and analyzing its emotion polarity is of great value.
In the prior art, two approaches are mainly used for text emotion analysis. The first is based on machine learning: suitable features must be selected to train a model, which then judges the emotion polarity of a text. The second is based on an emotion dictionary: a reasonable emotion calculation algorithm is designed to compute the emotion weight of a text, from which its emotion polarity is judged.
For example, a prior patent document (application number: 201611062208.1) discloses a sentiment analysis method for civil aviation security public opinion, which uses an emotion dictionary and rules to analyze microblog texts in the civil aviation field so as to filter out microblogs that threaten civil aviation safety. Another patent document (application number: 201610475678.4) provides an emotion analysis method based on social network data, which uses machine learning: a linear support vector machine model is trained on the emotion classification features extracted from a training set to obtain an emotion analysis classifier, and the classifier is then applied to the emotion classification features of the prediction set to predict the emotional tendency of the target data posted by users on the social network platform.
However, both approaches have disadvantages. The machine-learning approach requires a large manually annotated data set and is not well suited to fine-grained text such as movie reviews. The dictionary-based approach suits fine-grained text, but the emotion words it covers are limited. Therefore, to capture the emotional information in user reviews more fully, the invention performs emotion analysis on Douban movie reviews with a method based on emotion dictionaries and weak annotation, so that movie reviews are better divided into positive and negative. The invention is of significant value for predicting movie box office.
Disclosure of Invention
The invention aims to provide an emotion analysis method for Douban movie reviews. The method is designed specifically for fine-grained Chinese reviews on Douban and classifies their emotion using emotion dictionaries together with the weak annotation information available on Douban. Its advantages are that no manually annotated data set is needed, the emotion weight of a review is calculated accurately, and the emotion weight and the weak annotation information jointly determine the emotion polarity of the review, which improves the accuracy of movie review emotion analysis.
The invention adopts the following technical scheme for realizing the purpose:
an emotion analysis method for Douban movie reviews, characterized by comprising the following steps:
(1) crawling movie reviews from Douban, and then preprocessing the data, wherein the preprocessing comprises stop-word removal, word segmentation and part-of-speech tagging;
(2) constructing the four dictionaries required for movie review emotion analysis, namely a basic emotion dictionary, a negative word dictionary, a degree adverb dictionary and a movie-review domain emotion dictionary;
(3) scanning the words obtained by segmenting a single movie review and matching them against the emotion dictionaries constructed in step (2) to obtain the emotion words; when an emotion word is matched, further scanning for the negative words and degree adverbs that modify it and matching them against the negative word dictionary and the degree adverb dictionary; obtaining from the four dictionaries the emotion weight of each emotion word, the weight of the negative words and the weight multiple of the degree adverbs, and then combining these by emotion calculation to obtain the emotion weight of the single movie review; if the emotion weight is greater than or equal to 0, the emotion polarity of the movie review is positive; if the emotion weight is less than 0, the emotion polarity of the movie review is negative;
(4) the crawled movie review data contains the user's Douban rating, referred to as weak emotion annotation information; the rating has 5 grades; a movie review with a rating greater than or equal to 3 is assigned positive emotion polarity, and a movie review with a rating less than 3 is assigned negative emotion polarity;
(5) combining the emotion polarity obtained by emotion calculation in step (3) with the emotion polarity obtained from the user's Douban rating in step (4) to determine the emotion polarity of the movie review; if both polarities are positive, the emotion polarity of the movie review is determined to be positive; if both are negative, it is determined to be negative; and if the two are opposite, the emotion polarity of the movie review is the one obtained by emotion calculation in step (3).
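The decision logic of steps (3) to (5) can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names and the string labels "positive" and "negative" are assumptions introduced here, and the emotion weight is taken as already computed.

```python
def polarity_from_weight(weight: float) -> str:
    """Step (3): an emotion weight >= 0 is positive, otherwise negative."""
    return "positive" if weight >= 0 else "negative"


def polarity_from_rating(stars: int) -> str:
    """Step (4): 5-grade rating; >= 3 is positive, < 3 is negative."""
    if not 1 <= stars <= 5:
        raise ValueError("rating must be between 1 and 5")
    return "positive" if stars >= 3 else "negative"


def final_polarity(weight: float, stars: int) -> str:
    """Step (5): combine the dictionary-based and rating-based polarities."""
    by_dictionary = polarity_from_weight(weight)
    by_rating = polarity_from_rating(stars)
    if by_dictionary == by_rating:
        return by_dictionary      # both methods agree
    return by_dictionary          # disagreement: fall back to emotion calculation
```

As the steps are worded, the rating confirms the dictionary result when the two agree and does not override it when they differ.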
Preferably, in the emotion analysis method for Douban movie reviews provided by the invention, the four dictionaries are constructed as follows:
(1) the basic emotion dictionary is taken from the Chinese emotion vocabulary of Dalian University of Technology, which divides emotion words into five levels of emotion weight and three word types; the invention uses the number 1 to denote a positive word, the number 2 to denote a negative word, and 0 to denote a neutral word, whose emotion weight is 0; the five levels of emotion weight are 9, 7, 5, 3 and 1 respectively;
(2) the negative word dictionary covers negative words and rhetorical question words; when a negative word or a rhetorical question word modifies an emotion word, the emotion polarity of that word is reversed, and a rhetorical question word carries a stronger tone; a double negative does not change the emotion polarity of the word but strengthens the tone; 25 negative words obtained by manual screening form the negative word dictionary, in which a negative word has weight -1, a rhetorical question word has weight -2, and a double negative has weight 1;
(3) the degree adverb dictionary comes from the HowNet vocabulary; its words are divided into 6 grades, namely super, most, very, more, slightly and insufficiently; each grade is given a weight that scales the emotional intensity of the modified emotion word, the weight multiples being 3, 2.5, 2, 1.5, 1 and 0.5 respectively;
(4) the movie-review domain emotion dictionary is constructed because the basic emotion dictionary is incomplete and its coverage of emotion words is limited, so new emotion words specific to movie reviews need to be identified and collected into an emotion dictionary;
new emotion words are extracted by scanning the words obtained after segmenting a movie review and matching them against the existing basic emotion dictionary; a word that does not appear in the basic emotion dictionary is treated as a new word;
the emotion of a new word is determined by using the PMI algorithm to calculate the semantic similarity between the new word and seed words, from which the emotion polarity of the unknown new word is finally calculated;
PMI, or pointwise mutual information, measures the similarity between words; the similarity between an unknown word $w_1$ and a seed word $w_2$ is calculated as:
$PMI(w_1, w_2) = \log_2 \frac{P(w_1, w_2)}{p(w_1)\,p(w_2)}$ (1)
where $P(w_1, w_2)$ denotes the probability that $w_1$ and $w_2$ co-occur, and $p(w_1)$, $p(w_2)$ denote the probabilities that $w_1$ and $w_2$ occur alone;
formula (1) measures the semantic similarity of only one pair of words, which is not convincing for emotion analysis; therefore, on the basis of semantic similarity, the word frequencies of emotion words in movie reviews are counted and, according to the result, 30 seed words with strong positive and negative emotion polarity are selected to form a positive emotion word set $W_P$ and a negative emotion word set $W_N$, so that semantic similarity to multiple words is considered; formula (1) is improved accordingly to obtain a new formula for judging the emotion polarity of a new word $w$:
$SO(w) = \sum_{w_p \in W_P} PMI(w, w_p) - \sum_{w_n \in W_N} PMI(w, w_n)$ (2)
if the value of formula (2) is greater than or equal to 0, the emotion polarity of the new word $w$ is positive; if it is less than 0, the emotion polarity of the new word $w$ is negative; the new emotion words are divided into four levels, and the new emotion words of each level are given an emotion weight, namely 2, 1, -1 and -2 respectively.
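New-word extraction and polarity judgment can be sketched as follows. The sketch assumes that formula (2) has the common SO-PMI form (sum of PMI with the positive seed set minus sum of PMI with the negative seed set), that co-occurrence means appearing in the same review, and that a pair which never co-occurs contributes 0. All function names and the toy tokens are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def find_new_words(tokens, base_dictionary):
    """Words absent from the basic emotion dictionary, in first-seen order."""
    seen, result = set(), []
    for word in tokens:
        if word not in base_dictionary and word not in seen:
            seen.add(word)
            result.append(word)
    return result


def build_counts(reviews):
    """Count the reviews containing each word and each unordered word pair."""
    word_df, pair_df = Counter(), Counter()
    for tokens in reviews:
        unique = sorted(set(tokens))
        word_df.update(unique)
        pair_df.update(combinations(unique, 2))
    return word_df, pair_df, len(reviews)


def pmi(w1, w2, word_df, pair_df, n_reviews):
    """Formula (1); returns 0.0 when the pair never co-occurs (assumption)."""
    joint = pair_df[tuple(sorted((w1, w2)))]
    if joint == 0:
        return 0.0
    p_joint = joint / n_reviews
    p1, p2 = word_df[w1] / n_reviews, word_df[w2] / n_reviews
    return math.log2(p_joint / (p1 * p2))


def so_pmi(word, pos_seeds, neg_seeds, word_df, pair_df, n_reviews):
    """Assumed form of formula (2): >= 0 means positive, < 0 means negative."""
    pos = sum(pmi(word, s, word_df, pair_df, n_reviews) for s in pos_seeds)
    neg = sum(pmi(word, s, word_df, pair_df, n_reviews) for s in neg_seeds)
    return pos - neg


# Toy corpus of already-segmented reviews (hypothetical data).
reviews = [
    ["zhenxiang", "great"],
    ["zhenxiang", "great", "plot"],
    ["liangliang", "awful"],
    ["liangliang", "awful", "plot"],
]
word_df, pair_df, n = build_counts(reviews)
score = so_pmi("zhenxiang", ["great"], ["awful"], word_df, pair_df, n)
```

How a score is mapped onto the four weight levels 2, 1, -1 and -2 is not specified in the text, so the sketch stops at the sign of the score.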
Preferably, in the emotion analysis method for Douban movie reviews provided by the invention, the emotion weight of a movie review is calculated as follows:
a single movie review is denoted by the letter D, each emotion word in the review is denoted by $W_i$, and $sen_i$ denotes the emotion weight obtained by matching the emotion word against the emotion dictionary;
(1) the emotion value $E(W_i)$ of a word is calculated as:
$E(W_i) = N_i \times A_i \times sen_i$ (3)
in formula (3), $N_i$ denotes the weight of the negative word, double negative or rhetorical question word, $A_i$ denotes the weight multiple of the degree adverb, $sen_i$ denotes the emotion weight obtained by matching the emotion word against the emotion dictionary, $W_i$ denotes an emotion word, and $i$ is the index of the emotion word;
(2) if negative words appear before the emotion word, their number must be considered; if the number is odd, the emotion polarity of the emotion word is reversed; if the number is even, the negative words form a double negative and the emotion polarity of the emotion word is unchanged; the calculation formula is:
$N_i = (-1)^k$ (4)
where $k$ is the number of negative words; if a rhetorical question word appears before the emotion word, the emotion polarity of the emotion word is reversed with greater intensity, and its weight of -2 is taken from the negative word dictionary;
(3) the relative order of the negative word and the degree adverb also affects the emotion weight of the emotion word; for example, in "really not good-looking" the degree adverb precedes the negative word, whereas in "not really good-looking" the negative word precedes the degree adverb, and the emotion of the second phrase is clearly weaker than that of the first; therefore, when the negative word precedes the degree adverb, the value of formula (3) is multiplied by 0.5, and when the degree adverb precedes the negative word, the value of formula (3) is multiplied by -1; the calculation formula is:
$E'(W_i) = \begin{cases} 0.5 \times E(W_i), & loc(N) < loc(A) \\ -1 \times E(W_i), & loc(A) < loc(N) \end{cases}$ (5)
in formula (5), $loc(A)$ denotes the position of the degree adverb and $loc(N)$ denotes the position of the negative word;
(4) the emotion weight of a single movie review is therefore calculated as:
$E(D) = \sum_{i=1}^{n} E(W_i)$ (6)
where $n$ is the number of emotion words in the review D; when the value of formula (6) is greater than 0, the emotion polarity of the movie review is positive; when the value of formula (6) is less than 0, the emotion polarity of the movie review is negative.
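The calculation in formulas (3), (4) and (6) can be sketched as below. The miniature dictionaries, the English stand-in tokens and the rule that modifiers apply to the next emotion word are all assumptions made for illustration; the emotion weights are taken as already signed. The order adjustment of formula (5) is deliberately left out, since the text leaves its interaction with $N_i$ open.

```python
# Hypothetical miniature dictionaries, for illustration only.
EMOTION = {"good": 5.0, "boring": -5.0}    # signed sen_i per emotion word
DEGREE = {"very": 2.0, "slightly": 1.0}    # A_i, weight multiple of a degree adverb
NEGATION = {"not", "never"}                # each occurrence counts toward k
RHETORICAL = {"how_could"}                 # rhetorical question words, weight -2


def word_emotion(modifiers, emotion_word):
    """Formula (3): E(W_i) = N_i * A_i * sen_i for one emotion word."""
    sen_i = EMOTION[emotion_word]
    if any(m in RHETORICAL for m in modifiers):
        n_i = -2                           # weight from the negative word dictionary
    else:
        k = sum(1 for m in modifiers if m in NEGATION)
        n_i = (-1) ** k                    # formula (4)
    a_i = 1.0                              # no degree adverb: intensity unchanged
    for m in modifiers:
        if m in DEGREE:
            a_i = DEGREE[m]                # last degree adverb wins (assumption)
    return n_i * a_i * sen_i


def review_weight(tokens):
    """Formula (6): sum of E(W_i) over the emotion words of one review D."""
    total, pending = 0.0, []
    for token in tokens:
        if token in EMOTION:
            total += word_emotion(pending, token)
            pending = []                   # modifiers apply to the next emotion word only
        elif token in DEGREE or token in NEGATION or token in RHETORICAL:
            pending.append(token)
    return total
```

With these toy dictionaries, `review_weight(["not", "good"])` evaluates to -5.0 and `review_weight(["very", "good"])` to 10.0.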
Advantageous effects: compared with other methods, the emotion analysis method for Douban movie reviews provided by the invention has the following advantages:
(1) The emotion polarity of a Douban movie review is judged by combining the emotion dictionaries with weak annotation information; the movie-review domain emotion dictionary extends the coverage of emotion words, overcomes the limitation of the original emotion dictionary, and improves the accuracy of emotion analysis.
(2) The method calculates the emotion weight of a movie review accurately, which is valuable for predicting movie box office and for tracking users' emotional tendencies in a timely manner.
(3) Unlike machine-learning methods, the method does not require large-scale manual annotation of a data set and is suitable for fine-grained text such as movie reviews.
Drawings
FIG. 1 is a schematic flow chart of the emotion analysis method for Douban movie reviews provided by the invention;
FIG. 2 is a flow chart of calculating the emotion of a movie review using the emotion dictionaries;
FIG. 3 is a diagram of the emotion dictionary modules constructed by the invention;
FIG. 4 is a comparison of the experimental results validating the invention on the Wolf Warrior 2 movie review data set.
Detailed Description
The technical solution of the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
The flow of the emotion analysis method for Douban movie reviews provided by the invention is shown in FIG. 1 and comprises the following steps:
Step (1): first, movie review data is crawled from Douban and then preprocessed; the preprocessing comprises stop-word removal, word segmentation and part-of-speech tagging, and the user's rating data is also obtained;
for example, for the review "The movie lead's girlfriend really performed superbly in the men's 500 m short-track speed skating race!", the stop words of the review are deleted first, the ICTCLAS software of the Chinese Academy of Sciences is then used for word segmentation and part-of-speech tagging, and the review finally becomes {movie, leading role, girlfriend, on, short-track speed skating, men, 500, meter, race, really, performance, very excellent}.
Step (2): the emotion polarity of the review is judged using the emotion dictionaries, and is also judged using the weak annotation information given by the user's rating.
Step (3): if both emotion polarities obtained in step (2) are positive, the emotion polarity of the movie review is determined to be positive; if both are negative, it is determined to be negative; if the two are opposite, the emotion polarity of the movie review is the one obtained by emotion calculation using the emotion dictionaries.
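The preprocessing of step (1) can be sketched as a filter over segmenter output. The sketch assumes a segmenter has already produced (word, part-of-speech) pairs; the `Review` record, the tag letters and the stop-word list are hypothetical, and no call to ICTCLAS itself is shown.

```python
from dataclasses import dataclass


@dataclass
class Review:
    tokens: list   # (word, part-of-speech tag) pairs kept after filtering
    stars: int     # the user's rating, kept as weak annotation information


def preprocess(tagged_tokens, stars, stop_words):
    """Drop stop words from segmenter output and keep the rating alongside."""
    kept = [(word, tag) for word, tag in tagged_tokens if word not in stop_words]
    return Review(tokens=kept, stars=stars)


# English stand-ins for a segmented and tagged review (hypothetical data).
tagged = [("movie", "n"), ("of", "u"), ("performance", "n"),
          ("very", "d"), ("excellent", "a"), ("!", "w")]
review = preprocess(tagged, stars=5, stop_words={"of", "!"})
```

Keeping the rating in the same record as the tokens makes it straightforward to compare the dictionary-based polarity with the rating-based polarity in the later steps.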
The process of calculating the emotion of a movie review using the emotion dictionaries is described with reference to FIG. 2 and FIG. 3; the steps are as follows:
Step (1): the segmented movie review is taken as the object of analysis;
Step (2): each word is matched against the constructed emotion dictionaries to judge whether it is in them; if so, step (3) is executed; if not, the improved semantic similarity algorithm of the invention is used to determine whether it is a new emotion word, and the new word is added to the domain emotion dictionary;
Step (3): finally, the basic emotion dictionary, degree adverb dictionary, negative word dictionary and domain emotion dictionary of FIG. 3 are combined to calculate the emotion of the review by the method of the invention and obtain its emotion weight; the emotion calculation steps are as follows:
a single movie review is denoted by the letter D, each emotion word in the review is denoted by $W_i$, and $sen_i$ denotes the emotion weight obtained by matching the emotion word against the emotion dictionary;
a) the emotion value $E(W_i)$ of a word is calculated as:
$E(W_i) = N_i \times A_i \times sen_i$ (3)
in formula (3), $N_i$ denotes the weight of the negative word, double negative or rhetorical question word, $A_i$ denotes the weight multiple of the degree adverb, $sen_i$ denotes the emotion weight obtained by matching the emotion word against the emotion dictionary, $W_i$ denotes an emotion word, and $i$ is the index of the emotion word;
b) if negative words appear before the emotion word, their number must be considered; if the number is odd, the emotion polarity of the emotion word is reversed; if the number is even, the negative words form a double negative and the emotion polarity of the emotion word is unchanged; the calculation formula is:
$N_i = (-1)^k$ (4)
where $k$ is the number of negative words; if a rhetorical question word appears before the emotion word, the emotion polarity of the emotion word is reversed with greater intensity, and its weight of -2 is taken from the negative word dictionary;
c) the relative order of the negative word and the degree adverb also affects the emotion weight of the emotion word; for example, in "really not good-looking" the degree adverb precedes the negative word, whereas in "not really good-looking" the negative word precedes the degree adverb, and the emotion of the second phrase is clearly weaker than that of the first; therefore, when the negative word precedes the degree adverb, the value of formula (3) is multiplied by 0.5, and when the degree adverb precedes the negative word, the value of formula (3) is multiplied by -1; the calculation formula is:
$E'(W_i) = \begin{cases} 0.5 \times E(W_i), & loc(N) < loc(A) \\ -1 \times E(W_i), & loc(A) < loc(N) \end{cases}$ (5)
in formula (5), $loc(A)$ denotes the position of the degree adverb and $loc(N)$ denotes the position of the negative word;
d) the emotion weight of a single movie review is therefore calculated as:
$E(D) = \sum_{i=1}^{n} E(W_i)$ (6)
where $n$ is the number of emotion words in the review D; when the value of formula (6) is greater than 0, the emotion polarity of the movie review is positive; when the value of formula (6) is less than 0, the emotion polarity of the movie review is negative.
The basic emotion dictionary shown in FIG. 3 is taken from the Chinese emotion vocabulary of Dalian University of Technology, which divides emotion words into five levels of emotion weight and three word types; the invention uses the number 1 to denote a positive word, the number 2 to denote a negative word, and 0 to denote a neutral word, whose emotion weight is 0; the five levels of emotion weight are 9, 7, 5, 3 and 1 respectively; the dictionary is mainly used to match the emotion words in a review that belong to the basic emotion dictionary; examples are shown in Table 1.
TABLE 1 Basic emotion dictionary examples
Emotion word | Part of speech | Weight | Polarity
Desperation | Adjective | 9 | 2
Racing snow | Noun | 5 | 1
Happy | Adjective | 5 | 1
Number of people | Verb | 0 | 0
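A basic dictionary entry of this kind stores an unsigned weight plus a polarity code, so a signed value is convenient when weights are later summed. The sketch below assumes that negative words (code 2) contribute a negative $sen_i$; the text implies this through the sign test on the review weight but does not state it, and the function name is hypothetical.

```python
def signed_weight(weight: int, polarity_code: int) -> int:
    """Turn a (weight, polarity code) dictionary entry into a signed sen_i."""
    if weight not in (0, 1, 3, 5, 7, 9):
        raise ValueError("weight must be 0 or one of the five levels 1, 3, 5, 7, 9")
    if polarity_code == 1:      # positive word
        return weight
    if polarity_code == 2:      # negative word (sign convention is an assumption)
        return -weight
    if polarity_code == 0:      # neutral word always has emotion weight 0
        return 0
    raise ValueError("polarity code must be 0, 1 or 2")
```

Applied to the rows of Table 1, the entry with weight 9 and polarity 2 becomes -9, and the entries with weight 5 and polarity 1 stay at 5.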
The negative word dictionary shown in FIG. 3 covers negative words and rhetorical question words; when a negative word or a rhetorical question word modifies an emotion word, the emotion polarity of that word is reversed, and a rhetorical question word carries a stronger tone; a double negative does not change the emotion polarity of the word but strengthens the tone; 25 negative words obtained by manual screening form the negative word dictionary, in which a negative word has weight -1, a rhetorical question word has weight -2, and a double negative has weight 1; examples are shown in Table 2.
TABLE 2 Negative word dictionary and double negative dictionary examples
Word type | Words | Weight
Negative word | None, no. | -1
Rhetorical question word | Difficult to turn and make it possible to turn the switch … | -2
Double negative | Is not, is not at all … | 1
The degree adverb dictionary shown in FIG. 3 comes from the HowNet vocabulary; its words are divided into 6 grades, namely super, most, very, more, slightly and insufficiently; each grade is given a weight that scales the emotional intensity of the modified emotion word, the weight multiples being 3, 2.5, 2, 1.5, 1 and 0.5 respectively; the dictionary is mainly used to match the degree adverbs in a review; examples are shown in Table 3.
TABLE 3 Degree adverb dictionary examples
Grade | Adverbs | Weight multiple | Count
Super | Over, over and one core. | 3 | 30
Most | Baibai, Zhi … | 2.5 | 69
Very | He, He and Tai … | 2 | 42
More | Then, big, even … | 1.5 | 37
Slightly | Slight, slight … | 1 | 29
Insufficiently | Not so weak, not so much … | 0.5 | 12
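A grade table like this is naturally stored per grade but looked up per word. The sketch below inverts it into a word-to-multiple mapping; the grade keys follow the six grades named in the text, while the sample adverbs and their grade assignment are illustrative assumptions rather than entries taken from the patent's dictionary, and the default of 1.0 for non-adverbs is also an assumption.

```python
# Weight multiple per grade, as listed in the text.
GRADE_MULTIPLE = {
    "super": 3.0, "most": 2.5, "very": 2.0,
    "more": 1.5, "slightly": 1.0, "insufficiently": 0.5,
}

# Illustrative sample adverbs per grade (assumed, not from the patent).
SAMPLE_ADVERBS = {
    "super": ["过度", "过分"],
    "most": ["极其", "最"],
    "very": ["很", "太"],
    "more": ["较", "更"],
    "slightly": ["稍微", "略微"],
    "insufficiently": ["不大", "不怎么"],
}

# Invert grade -> words into word -> weight multiple for constant-time lookup.
DEGREE_LOOKUP = {
    word: GRADE_MULTIPLE[grade]
    for grade, words in SAMPLE_ADVERBS.items()
    for word in words
}


def degree_multiple(word: str) -> float:
    """Weight multiple A_i of a word; 1.0 if it is not a degree adverb."""
    return DEGREE_LOOKUP.get(word, 1.0)
```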
The domain emotion dictionary shown in FIG. 3 is the set of new emotion words identified by the semantic similarity algorithm; the weight of each new emotion word is calculated, and the new emotion words then form the domain emotion dictionary; examples are shown in Table 4.
TABLE 4 Domain emotion dictionary examples
Microblog new words | Weight | Count
Zhenxiang, skr, koi … | 2 | 18
Guan Xuan, Buddha series, confirmation of eye spirit … | 1 | 40
Arranging, cooling, big pig hoof … | -1 | 65
Gou die, nima, middle-aged greasy man … | -2 | 41
The experimental verification of the feasibility of the method of FIGS. 1, 2 and 3 is described below with reference to FIG. 4:
First, a data set of reviews of the movie Wolf Warrior 2 is crawled from Douban. Emotion analysis experiments are then run on this data set with a method based only on the basic emotion dictionary and with the method of the invention, to test the performance of the method, with accuracy as the evaluation criterion. The results in FIG. 4 show that the method of the invention achieves higher accuracy on Douban movie reviews than the method based on the basic emotion dictionary alone. The method provided by the invention is therefore effective for emotion analysis of Douban movie reviews and has broad application prospects in this field.
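Accuracy, the evaluation criterion named above, is the fraction of reviews whose predicted polarity matches the reference polarity. A minimal sketch follows; the function name and the label strings are assumptions, and the sample labels are made up for illustration and are not results from the experiment.

```python
def accuracy(predicted, reference):
    """Fraction of reviews whose predicted polarity equals the reference label."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Made-up labels for four reviews, for illustration only.
predicted = ["positive", "negative", "positive", "positive"]
reference = ["positive", "negative", "negative", "positive"]
score = accuracy(predicted, reference)
```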
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution. The description is written this way for clarity only; those skilled in the art should take the description as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.

Claims (3)

1. An emotion analysis method for a bean-net movie review is characterized by comprising the following steps:
(1) firstly, performing data crawling operation on movie reviews on a broad bean network, and then performing preprocessing operation on the data, wherein the preprocessing operation comprises deleting stop words, participles and part-of-speech labels;
(2) constructing four types of dictionaries required by the emotion analysis of the film reviews, wherein the four types of dictionaries are a basic emotion dictionary, a negative word dictionary, a degree adverb dictionary and an emotion dictionary in the film review field respectively;
(3) scanning and matching words obtained by segmenting a single movie comment with an emotion dictionary according to the emotion dictionary constructed in the step (2) to obtain a plurality of emotion words; when the emotional words are matched, further scanning and matching negative words, degree adverb or non-definite word dictionaries and degree adverb dictionaries of the modified emotional words; calculating the emotion weight of the emotion words, the weight of the negative words and the weight multiple of the degree adverbs according to the four types of dictionaries, and then carrying out emotion calculation on the emotion weight of the emotion words, the weight of the negative words and the weight multiple of the degree adverbs to obtain the emotion weight of the single movie comment; if the emotion weight is greater than or equal to 0, the emotion polarity of the movie comment is positive; if the emotion weight is less than 0, the emotion polarity of the movie comment is negative;
(4) because the obtained movie comment data contains the user's bean scores, which are called as emotion weak annotation information, and the scores have 5 grades, the emotion polarity of the movie comment with the score larger than or equal to 3 is selected as the positive direction, and the emotion polarity of the movie comment with the score smaller than 3 is selected as the negative direction;
(5) combining the emotion polarity obtained by the emotion calculation in step (3) with the emotion polarity obtained from the user's Douban rating in step (4) to further determine the emotion polarity of the movie review; if the emotion polarities obtained by the two methods are both positive, the emotion polarity of the movie review is determined to be positive; if the emotion polarities obtained by the two methods are both negative, the emotion polarity of the movie review is determined to be negative; and if the emotion polarities obtained by the two methods are opposite, the emotion polarity of the movie review is determined to be that obtained by the emotion calculation in step (3).
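The decision rule of steps (3) to (5) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the rating is assumed to be an integer from 1 to 5, which the claim implies ("5 levels") but does not state. The second return value, a flag saying whether the rating agrees with the lexicon result, is an addition for illustration and is not part of the claim.

```python
from typing import Tuple


def lexicon_polarity(emotion_weight: float) -> str:
    """Step (3): a weight >= 0 is positive, a weight < 0 is negative."""
    return "positive" if emotion_weight >= 0 else "negative"


def rating_polarity(rating: int) -> str:
    """Step (4): a rating >= 3 is positive, a rating < 3 is negative.

    Assumption: ratings are integers from 1 to 5.
    """
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating is assumed to be an integer from 1 to 5")
    return "positive" if rating >= 3 else "negative"


def final_polarity(emotion_weight: float, rating: int) -> Tuple[str, bool]:
    """Step (5): combine the lexicon result with the weak label.

    When the two agree, the shared polarity is returned. When they
    disagree, step (5) keeps the step (3) result, so the lexicon
    polarity is what is returned in both cases. The boolean reports
    whether the rating agreed (hypothetical extra, not in the claim).
    """
    lexicon = lexicon_polarity(emotion_weight)
    weak_label = rating_polarity(rating)
    return lexicon, lexicon == weak_label


if __name__ == "__main__":
    print(final_polarity(3.5, 5))   # both methods agree
    print(final_polarity(-2.0, 4))  # conflict: lexicon result is kept
```

As the claim is worded, all three branches of step (5) yield the step (3) polarity, so in this sketch the rating only serves as a consistency check on the lexicon result.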
2. The emotion analysis method for Douban movie reviews according to claim 1, wherein in step (2) the four types of dictionaries are constructed as follows:
(1) the basic emotion dictionary is taken from the Chinese emotion dictionary library of Dalian University of Technology, which divides emotion words into five levels of emotion weight and three word classes; the invention uses the numeral 1 to denote a positive word, the numeral 2 to denote a negative word, and 0 to denote a neutral word whose emotion weight is 0; the five levels of emotion weight are 9, 7, 5, 3 and 1 respectively;
(2) the negative word dictionary comprises two parts, negative words and rhetorical question words; when a negative word or a rhetorical question word modifies an emotion word, the emotion polarity of the word is reversed, the tone of a rhetorical question word being the stronger of the two; a double negative does not change the emotion polarity of the word but strengthens the tone; 25 negative words are obtained by manual screening to form the negative word dictionary, in which the weight of a negative word is -1, the weight of a rhetorical question word is -2, and the weight of a double negative is 1;
(3) the degree adverb dictionary comes from the HowNet dictionary library; its words are divided into 6 levels, namely super, most, very, more, slightly and under; each of the 6 levels is given a weight by which the emotion intensity of the modified emotion word is multiplied, the weight multiples being 3, 2.5, 2, 1.5, 1 and 0.5 respectively;
(4) the emotion dictionary for the movie review domain is constructed because the basic emotion dictionary is incomplete and its coverage of emotion words is limited, so that new emotion words specific to movie reviews need to be identified and an emotion dictionary constructed for them;
the new emotion words are extracted by scanning the words obtained after segmenting the movie reviews and matching them against the existing basic emotion dictionary; a word that does not appear in the basic emotion dictionary is taken to be a new word;
the emotion polarity of a new word is determined by using the PMI algorithm to calculate the semantic similarity between the new word and seed words, from which the emotion polarity of the unknown new word is finally calculated;
PMI, also called pointwise mutual information, is mainly used to calculate the similarity between words; the similarity between an unknown word $w_1$ and a seed word $w_2$ is calculated as follows:
$\mathrm{PMI}(w_1, w_2) = \log_2 \dfrac{P(w_1, w_2)}{p(w_1)\,p(w_2)}$ (1)
where $P(w_1, w_2)$ denotes the probability that $w_1$ and $w_2$ co-occur, and $p(w_1)$ and $p(w_2)$ denote the probabilities that $w_1$ and $w_2$ occur alone;
formula (1) can only calculate the semantic similarity of a single pair of words, which is not convincing enough for emotion analysis; therefore, building on semantic similarity, when the word frequencies of the emotion words in the movie reviews are counted, 30 seed words with strong positive and negative emotion polarity are selected according to the result to form a positive emotion word set $W_P$ and a negative emotion word set $W_N$, which are used to examine the semantic similarity among multiple words; formula (1) is improved accordingly to obtain a new formula for judging the emotion polarity of a new word $w$:
$\mathrm{SO\text{-}PMI}(w) = \sum_{w_p \in W_P} \mathrm{PMI}(w, w_p) - \sum_{w_n \in W_N} \mathrm{PMI}(w, w_n)$ (2)
if the value of formula (2) is greater than or equal to 0, the emotion polarity of the new word w is positive; if it is less than 0, the emotion polarity of the new word w is negative; the new emotion words are divided into four levels, and the new emotion words of each level are given an emotion weight, the weights being 2, 1, -1 and -2 respectively.
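The polarity test for new words can be illustrated with a small Python sketch. It rests on several assumptions that the claim does not spell out: probabilities are estimated at review level (the fraction of reviews containing a word or a pair of words), the logarithm is base 2, formula (2) is taken to be the sum of PMI values over the positive seeds minus the sum over the negative seeds, and a pair that never co-occurs contributes 0 rather than negative infinity. All function names and the toy English tokens are hypothetical.

```python
import math
from typing import Iterable, Sequence, Set


def pmi(w1: str, w2: str, reviews: Sequence[Set[str]]) -> float:
    """Pointwise mutual information from review-level co-occurrence.

    Assumption: a pair that never co-occurs contributes 0.0 instead of
    negative infinity, so one missing pair cannot dominate the sum.
    """
    n = len(reviews)
    count_1 = sum(1 for r in reviews if w1 in r)
    count_2 = sum(1 for r in reviews if w2 in r)
    count_both = sum(1 for r in reviews if w1 in r and w2 in r)
    if n == 0 or count_both == 0:
        return 0.0
    return math.log2((count_both / n) / ((count_1 / n) * (count_2 / n)))


def so_pmi(word: str, positive_seeds: Iterable[str],
           negative_seeds: Iterable[str],
           reviews: Sequence[Set[str]]) -> float:
    """PMI with the positive seeds minus PMI with the negative seeds."""
    positive = sum(pmi(word, seed, reviews) for seed in positive_seeds)
    negative = sum(pmi(word, seed, reviews) for seed in negative_seeds)
    return positive - negative


def new_word_polarity(score: float) -> str:
    """A score >= 0 is positive, a score < 0 is negative."""
    return "positive" if score >= 0 else "negative"


if __name__ == "__main__":
    # Toy corpus: each review is the set of its segmented tokens.
    reviews = [
        {"stunning", "good", "plot"},
        {"stunning", "good"},
        {"trash", "bad"},
        {"trash", "bad", "plot"},
    ]
    for candidate in ("stunning", "trash"):
        score = so_pmi(candidate, ["good"], ["bad"], reviews)
        print(candidate, score, new_word_polarity(score))
```

Mapping the score onto the four weight levels 2, 1, -1 and -2 would need thresholds that the claim does not give, so that step is left out of the sketch.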
3. The emotion analysis method for Douban movie reviews according to claim 1, wherein in step (3) the emotion weight of a single movie review is calculated as follows:
a single movie review is denoted by the letter D, each emotion word in the review is denoted by $W_i$, and $sen_i$ denotes the emotion weight obtained by matching the emotion word against the emotion dictionary;
(1) the emotion value $E(W_i)$ of a word is calculated as follows:
$E(W_i) = N_i \times A_i \times sen_i$ (3)
in formula (3), $N_i$ denotes the weight of the negative word, double negative or rhetorical question word, $A_i$ denotes the weight multiple of the degree adverb, $sen_i$ denotes the emotion weight obtained by matching the emotion word against the emotion dictionary, $W_i$ denotes an emotion word, and i is the index of the emotion word;
(2) if negative words appear before the emotion word, their number must be considered; if the number is odd, the emotion polarity of the emotion word is reversed; if the number is even, a double negative results and the emotion polarity of the emotion word is unchanged; the specific formula is:
$N_i = (-1)^k$ (4)
where k is the number of negative words; if a rhetorical question word appears before the emotion word, the emotion polarity of the emotion word is reversed with greater intensity, and its weight of -2 is taken from the negative word dictionary;
(3) the relative order of the negative word and the degree adverb also affects the emotion weight of the emotion word, as in "really not good to watch" (degree adverb before negative word) and "not really good to watch" (negative word before degree adverb), where the emotion of the second phrase is clearly weaker than that of the first; therefore, when the negative word precedes the degree adverb, the value of formula (3) is multiplied by 0.5, and when the degree adverb precedes the negative word, the value of formula (3) is multiplied by -1; the specific formula is:
$E(W_i) = \begin{cases} 0.5 \times N_i \times A_i \times sen_i, & loc(N) < loc(A) \\ -1 \times N_i \times A_i \times sen_i, & loc(A) < loc(N) \end{cases}$ (5)
in formula (5), loc(A) denotes the position of the degree adverb and loc(N) denotes the position of the negative word;
(4) the emotion weight of a single movie review is therefore calculated as:
$E(D) = \sum_{i} E(W_i)$ (6)
when the value of formula (6) is greater than 0, the emotion polarity of the movie review is positive; when the value of formula (6) is less than 0, the emotion polarity of the movie review is negative.
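A minimal Python sketch of the calculation in formulas (3) to (6) follows. The data structure, field names and function names are hypothetical. The sketch assumes that `sen` is already signed (negative for negative emotion words), that an emotion word with no degree adverb has a multiple of 1, and that the position factor is applied on top of formula (3) exactly as the claim words it. The claim does not say whether the -1 factor is meant to apply in addition to $N_i$, so that reading is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Weight multiples for the six degree-adverb levels named in claim 2.
DEGREE_MULTIPLE = {
    "super": 3.0, "most": 2.5, "very": 2.0,
    "more": 1.5, "slightly": 1.0, "under": 0.5,
}


@dataclass
class EmotionHit:
    """One matched emotion word with its modifiers (hypothetical layout)."""
    sen: float                    # signed emotion weight from the dictionary
    k: int = 0                    # number of negative words before the word
    rhetorical: bool = False      # modified by a rhetorical question word
    degree: Optional[str] = None  # degree-adverb level, if any
    loc_n: Optional[int] = None   # token position of the negative word
    loc_a: Optional[int] = None   # token position of the degree adverb


def negation_weight(hit: EmotionHit) -> float:
    """Formula (4): (-1)^k, or -2 for a rhetorical question word."""
    if hit.rhetorical:
        return -2.0
    return float((-1) ** hit.k)


def word_value(hit: EmotionHit) -> float:
    """Formula (3), with the order factor of formula (5) when both
    a negative word and a degree adverb are present."""
    multiple = DEGREE_MULTIPLE[hit.degree] if hit.degree else 1.0
    value = negation_weight(hit) * multiple * hit.sen
    if hit.loc_n is not None and hit.loc_a is not None:
        # Negative word first: weaken by 0.5; adverb first: multiply by -1.
        value *= 0.5 if hit.loc_n < hit.loc_a else -1.0
    return value


def review_weight(hits: Iterable[EmotionHit]) -> float:
    """Formula (6): sum of the word values over the review."""
    return sum(word_value(hit) for hit in hits)


def review_polarity(hits: Iterable[EmotionHit]) -> str:
    """Uses the >= 0 threshold of claim 1, step (3)."""
    return "positive" if review_weight(hits) >= 0 else "negative"


if __name__ == "__main__":
    hits = [
        EmotionHit(sen=5, degree="very"),  # degree adverb + positive word
        EmotionHit(sen=-3, k=1),           # one negation + negative word
    ]
    print(review_weight(hits), review_polarity(hits))
```

Claim 1 treats a weight of exactly 0 as positive while claim 3 only covers values above and below 0; the sketch follows claim 1 for that boundary case.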
CN201911009781.XA 2019-10-23 2019-10-23 Emotion analysis method for broad-bean-net movie comment Pending CN110598219A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911009781.XA CN110598219A (en) 2019-10-23 2019-10-23 Emotion analysis method for broad-bean-net movie comment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911009781.XA CN110598219A (en) 2019-10-23 2019-10-23 Emotion analysis method for broad-bean-net movie comment

Publications (1)

Publication Number Publication Date
CN110598219A (en) 2019-12-20

Family

ID=68850112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911009781.XA Pending CN110598219A (en) 2019-10-23 2019-10-23 Emotion analysis method for broad-bean-net movie comment

Country Status (1)

Country Link
CN (1) CN110598219A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111104515A (en) * 2019-12-24 2020-05-05 山东众志电子有限公司 Emotional word text information classification method
CN111310455A (en) * 2020-02-11 2020-06-19 安徽理工大学 New emotion word polarity calculation method for online shopping comments
CN112000804A (en) * 2020-08-18 2020-11-27 安徽理工大学 Microblog hot topic user group emotion tendentiousness analysis method
CN112364646A (en) * 2020-11-18 2021-02-12 安徽财经大学 Sentence comment emotion polarity analysis method considering modifiers
CN112417892A (en) * 2020-12-08 2021-02-26 珠海横琴博易数据技术有限公司 Semantic emotion recognition method
CN112668330A (en) * 2020-12-31 2021-04-16 北京大米科技有限公司 Data processing method and device, readable storage medium and electronic equipment
CN112926307A (en) * 2021-03-19 2021-06-08 闽江学院 Dependency relationship-based evaluation object emotion analysis method and storage medium
CN113254647A (en) * 2021-06-11 2021-08-13 大唐融合通信股份有限公司 Course quality analysis method, device and system
CN116805147A (en) * 2023-02-27 2023-09-26 杭州城市大脑有限公司 Text labeling method and device applied to urban brain natural language processing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017051425A1 (en) * 2015-09-23 2017-03-30 Devanathan Giridhari A computer-implemented method and system for analyzing and evaluating user reviews
CN109684647A (en) * 2019-02-19 2019-04-26 东北林业大学 Film comment sentiment analysis method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴杰胜等: "基于多部情感词典与SVM的电影评论情感分析", 《阜阳师范学院学报(自然科学版)》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111104515A (en) * 2019-12-24 2020-05-05 山东众志电子有限公司 Emotional word text information classification method
CN111310455A (en) * 2020-02-11 2020-06-19 安徽理工大学 New emotion word polarity calculation method for online shopping comments
CN112000804A (en) * 2020-08-18 2020-11-27 安徽理工大学 Microblog hot topic user group emotion tendentiousness analysis method
CN112000804B (en) * 2020-08-18 2022-08-02 安徽理工大学 Microblog hot topic user group emotion tendentiousness analysis method
CN112364646A (en) * 2020-11-18 2021-02-12 安徽财经大学 Sentence comment emotion polarity analysis method considering modifiers
CN112417892A (en) * 2020-12-08 2021-02-26 珠海横琴博易数据技术有限公司 Semantic emotion recognition method
CN112668330A (en) * 2020-12-31 2021-04-16 北京大米科技有限公司 Data processing method and device, readable storage medium and electronic equipment
CN112668330B (en) * 2020-12-31 2024-01-26 北京大米科技有限公司 Data processing method and device, readable storage medium and electronic equipment
CN112926307A (en) * 2021-03-19 2021-06-08 闽江学院 Dependency relationship-based evaluation object emotion analysis method and storage medium
CN113254647A (en) * 2021-06-11 2021-08-13 大唐融合通信股份有限公司 Course quality analysis method, device and system
CN116805147A (en) * 2023-02-27 2023-09-26 杭州城市大脑有限公司 Text labeling method and device applied to urban brain natural language processing
CN116805147B (en) * 2023-02-27 2024-03-22 杭州城市大脑有限公司 Text labeling method and device applied to urban brain natural language processing

Similar Documents

Publication Publication Date Title
CN110598219A (en) Emotion analysis method for broad-bean-net movie comment
Li et al. Sentiment analysis of danmaku videos based on naïve bayes and sentiment dictionary
Sahu et al. Sentiment analysis of movie reviews: A study on feature selection & classification algorithms
CN107609132B (en) Semantic ontology base based Chinese text sentiment analysis method
Li et al. Structure-aware review mining and summarization
CN111797898B (en) Online comment automatic reply method based on deep semantic matching
CN109885670A (en) A kind of interaction attention coding sentiment analysis method towards topic text
Chang et al. Research on detection methods based on Doc2vec abnormal comments
CN106326212A (en) Method for analyzing implicit type discourse relation based on hierarchical depth semantics
CN110390018A (en) A kind of social networks comment generation method based on LSTM
CN103473380B (en) A kind of computer version sensibility classification method
CN108108468A (en) A kind of short text sentiment analysis method and apparatus based on concept and text emotion
CN104199845B (en) Line Evaluation based on agent model discusses sensibility classification method
CN107688870A (en) A kind of the classification factor visual analysis method and device of the deep neural network based on text flow input
CN111626050B (en) Microblog emotion analysis method based on expression dictionary and emotion general knowledge
CN112966526A (en) Automobile online comment emotion analysis method based on emotion word vector
Guo et al. Local government debt risk assessment: A deep learning-based perspective
Mozafari et al. Emotion detection by using similarity techniques
Batra et al. A large-scale tweet dataset for urdu text sentiment analysis
CN115329085A (en) Social robot classification method and system
CN107818173A (en) A kind of false comment filter method of Chinese based on vector space model
CN111985223A (en) Emotion calculation method based on combination of long and short memory networks and emotion dictionaries
Zhao et al. POS-ATAEPE-BiLSTM: an aspect-based sentiment analysis algorithm considering part-of-speech embedding
Fu et al. Sentiment Analysis of Tourist Scenic Spots Internet Comments Based on LSTM
Jayawickrama et al. Facebook for Sentiment Analysis: Baseline Models to Predict Facebook Reactions of Sinhala Posts

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191220
