CN111414755A - Network emotion analysis method based on fine-grained emotion dictionary - Google Patents


Info

Publication number
CN111414755A
Authority
CN
China
Prior art keywords: text, emotion, network, fine, outputting
Prior art date
Legal status: Pending
Application number
CN202010202982.8A
Other languages
Chinese (zh)
Inventor
杨小兵
陈欣
Current Assignee
China Jiliang University
Original Assignee
China Jiliang University
Priority date
Filing date
Publication date
Application filed by China Jiliang University
Priority to CN202010202982.8A
Publication of CN111414755A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks

Abstract

The invention discloses a network emotion analysis method based on a fine-grained emotion dictionary, which comprises the following steps: acquiring a network text, calculating the network text with the fine-grained emotion dictionary method, and outputting a word vector text; performing semantic acquisition processing on the word vector text and outputting an original text feature set matched with the network text; and performing a nonlinear transformation on the original text feature set to output an F value, which is taken as the index of network emotion analysis. The construction of the fine-grained emotion dictionary makes the word information more comprehensive, alleviates the problem of data scarcity, and improves prediction accuracy.

Description

Network emotion analysis method based on fine-grained emotion dictionary
Technical Field
The invention relates to the field of emotion analysis, in particular to a network emotion analysis method based on a fine-grained emotion dictionary.
Background
With the rapid development of social networks and electronic commerce, social networks and shopping platforms such as microblog, Twitter, WeChat, QQ, Facebook, Taobao and Jingdong have greatly influenced people's lives, and more and more users prefer to publish their own opinions on social media rather than merely browsing and receiving information. In China, the microblog has become a core platform on which many young people share and acquire information. This information carries personal moods such as happiness, anger, sadness and joy; analyzing the moods in the information reveals users' inner activities and character traits. Analyzing people's attitudes toward public events and social phenomena also allows the progress of events to be detected and controlled more effectively. Therefore, emotion analysis of texts in social media such as microblogs is of great significance.
In current systems there is a problem of data scarcity: emotion training corpora and emotion dictionary resources are scarce. To solve this problem, a method that fuses a fine-grained emotion dictionary is provided. The method constructs, from existing emotion data, a fine-grained microblog emotion dictionary containing sentiment information, mood information and part-of-speech information, so that the word information is more comprehensive; the word information is then fused with vectors obtained by large-scale text pre-training to form emotion word vectors, which alleviates the problem of data scarcity and improves prediction accuracy.
Disclosure of Invention
The invention provides a network emotion analysis method based on a fine-grained emotion dictionary, and aims to solve the problem of data scarcity in the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention discloses a network emotion analysis method based on a fine-grained emotion dictionary, which comprises the following steps of:
acquiring a web text, calculating the web text according to a fine-grained emotion dictionary method, and outputting a word vector text;
semantic acquisition processing is carried out on the word vector text, and an original text characteristic set matched with the network text is output;
and carrying out nonlinear transformation on the original text feature set to output an F value, and taking the F value as an index of network emotion analysis.
A network text is acquired, the network text is calculated with the fine-grained emotion dictionary method, and a word vector text is output; semantic acquisition processing is performed on the word vector text, and an original text feature set matched with the network text is output; a nonlinear transformation is performed on the original text feature set to output an F value, which is taken as the index of network emotion analysis. The construction of the fine-grained emotion dictionary makes the word information more comprehensive, alleviates the problem of data scarcity, and improves prediction accuracy.
Preferably, the obtaining of the web text, the calculating of the web text according to the fine-grained emotion dictionary method, and the outputting of the word vector text include:
preprocessing the acquired network text with jieba word segmentation, and outputting a network preprocessed text;
splicing word vectors of words in the network preprocessed text to obtain a network preprocessed word vector text, wherein the formula is as follows:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation;
fusing the network preprocessed word vector text with the fine-grained emotion dictionary by the formula
X = V_T ⊕ V_P ⊕ V_M ⊕ V_E
to output a fused word vector text X, where V_P represents part-of-speech information, V_M represents mood information and V_E represents sentiment information.
Preferably, the semantic acquisition processing is performed on the word vector text, and an original text feature set matched with the web text is output, including:
inputting the word vector text into an Attention layer, and outputting an Attention sequence after the Attention calculation;
inputting the Attention sequence into a convolution layer, performing convolution operation, and outputting a characteristic matrix C;
and inputting the feature matrix C into a pooling layer for sampling operation, and outputting the original text feature set.
Preferably, the method for outputting the F value by performing nonlinear transformation on the original text feature set and using the F value as an index of network emotion analysis includes:
inputting the original text feature set into a multilayer perceptron, outputting emotion tag score vectors, performing Softmax calculation on the emotion tag score vectors, and outputting relative probabilities of the emotion tag score vectors;
and correspondingly setting the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputting the network emotion analysis method parameters into an algorithm for calculating an F value, and taking the output F value as an index of network emotion analysis.
A network emotion analysis device based on a fine-grained emotion dictionary comprises:
the acquisition module is used for acquiring the web text, calculating the web text according to a fine-grained emotion dictionary method and outputting a word vector text;
the matching module is used for performing semantic acquisition processing on the word vector text and outputting an original text characteristic set matched with the network text;
and the output module is used for carrying out nonlinear transformation on the original text feature set to output an F value, and the F value is used as an index of network emotion analysis.
Preferably, the acquiring module includes:
the preprocessing unit is used for preprocessing the acquired network text with jieba word segmentation and outputting a network preprocessed text;
the combination splicing unit is used for splicing word vectors of words in the network preprocessed text to obtain a network preprocessed word vector text, and the formula is as follows:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation;
the fusion unit fuses the network preprocessed word vector text with the fine-grained emotion dictionary by the formula
X = V_T ⊕ V_P ⊕ V_M ⊕ V_E
and outputs a fused word vector text X, where V_P represents part-of-speech information, V_M represents mood information and V_E represents sentiment information.
Preferably, the matching module includes:
the Attention layer unit is used for inputting the word vector text into an Attention layer and outputting an Attention sequence after the Attention calculation;
the convolutional layer unit is used for inputting the Attention sequence into a convolutional layer and carrying out convolution operation to output a characteristic matrix C;
and the pooling layer unit is used for inputting the feature matrix C into a pooling layer for sampling operation and outputting the original text feature set.
Preferably, the output module includes:
the Softmax calculation unit is used for inputting the original text feature set into the multilayer perceptron, outputting emotion tag score vectors, performing Softmax calculation on the emotion tag score vectors and outputting relative probabilities of the emotion tag score vectors;
and the F value calculating unit is used for correspondingly setting the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputting the network emotion analysis method parameters into an algorithm for calculating the F value, and outputting the F value as an index of network emotion analysis.
An electronic device comprising a memory and a processor, the memory storing one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement a fine-grained emotion dictionary based network emotion analysis method as recited in any of the above.
A computer readable storage medium storing a computer program which, when executed by a computer, implements a fine-grained sentiment dictionary based network sentiment analysis method as claimed in any one of the above.
The invention has the following beneficial effects:
the method comprises the steps of obtaining a network text, calculating the network text according to a fine-grained emotion dictionary method, outputting a word vector text, performing semantic obtaining processing on the word vector text, outputting an original text feature set matched with the network text, performing nonlinear transformation on the original text feature set to output an F value, and taking the F value as an index of network emotion analysis. The construction of the fine-grained emotional dictionary enables word information to be more comprehensive, the problem of data scarcity is solved, and the accuracy of prediction is improved.
Drawings
FIG. 1 is a first flowchart of a network emotion analysis method based on a fine-grained emotion dictionary according to an embodiment of the present invention;
FIG. 2 is a second flowchart of a network emotion analysis method based on a fine-grained emotion dictionary according to an embodiment of the present invention;
FIG. 3 is a third flowchart of a network emotion analysis method based on a fine-grained emotion dictionary according to an embodiment of the present invention;
FIG. 4 is a fourth flowchart of a network emotion analysis method based on a fine-grained emotion dictionary according to an embodiment of the present invention;
FIG. 5 is a flowchart of a specific implementation of a network emotion analysis method based on a fine-grained emotion dictionary according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a network emotion analysis device based on a fine-grained emotion dictionary according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an obtaining module of a network emotion analysis device based on a fine-grained emotion dictionary according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a matching module of a network emotion analysis device based on a fine-grained emotion dictionary according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an output module of a network emotion analysis device based on a fine-grained emotion dictionary according to an embodiment of the present invention;
FIG. 10 is a flowchart of an embodiment of the present invention for implementing a fine-grained sentiment dictionary-based network sentiment analysis apparatus;
FIG. 11 is a schematic diagram of an electronic device for implementing a fine-grained emotion dictionary-based network emotion analysis method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Before the technical solution of the present invention is introduced, a scenario to which the technical solution of the present invention may be applicable is exemplarily described.
Example 1
As shown in fig. 1, a network emotion analysis method based on a fine-grained emotion dictionary includes the following steps:
s110, obtaining a web text, calculating the web text according to a fine-grained emotion dictionary method, and outputting a word vector text;
s120, performing semantic acquisition processing on the word vector text, and outputting an original text feature set matched with the web text;
s130, carrying out nonlinear transformation on the original text feature set to output an F value, and taking the F value as an index of network emotion analysis.
According to embodiment 1, a web text is acquired, the web text is calculated with the fine-grained emotion dictionary method, and a word vector text is output; semantic acquisition processing is performed on the word vector text, and an original text feature set matched with the web text is output; a nonlinear transformation is performed on the original text feature set to output an F value, which is taken as the index of network emotion analysis. The network emotion analysis method based on the fine-grained emotion dictionary constructs, from existing emotion data, a fine-grained microblog emotion dictionary containing sentiment information, mood information and part-of-speech information, so that the word information is more comprehensive; the word information is combined with vectors obtained by large-scale text pre-training to form emotion word vectors, which alleviates the problem of data scarcity and improves prediction accuracy.
Example 2
As shown in fig. 2, a network emotion analysis method based on a fine-grained emotion dictionary includes:
s210, obtaining a web text, calculating the web text according to a fine-grained emotion dictionary method, and outputting a word vector text;
s220, preprocessing the acquired web text through the crust segmentation, and outputting a web preprocessed text;
s230, splicing word vectors of words in the network preprocessed text to obtain a network preprocessed word vector text, wherein the formula is as follows:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation;
s240, utilizing formulas for the vector text of the network preprocessed words and the fine-grained emotion dictionary
Figure BDA0002419995720000073
Fused output word vector text X, where VPRepresenting part of speech information, VMRepresenting emotional information, VERepresenting an emotion.
According to embodiment 2, the fine-grained emotion dictionary method comprises text preprocessing and word vector representation. The acquired web text is preprocessed with jieba word segmentation and the web preprocessed text is output; the preprocessing of the web text is divided into two steps, data cleaning and Chinese word segmentation. The word vectors of the words in the web preprocessed text are spliced to obtain the network preprocessed word vector text, where a word vector consists of four parts: the text vector V_T, the part-of-speech vector V_P, the sentiment vector V_E and the mood vector V_M. The formula used is:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation. The network preprocessed word vector text and the fine-grained emotion dictionary are then fused by the formula
X = V_T ⊕ V_P ⊕ V_M ⊕ V_E
to output the fused word vector text X, where V_P is the part-of-speech information divided into 7 classes, V_M is the mood information divided into 7 classes, and V_E is the sentiment information divided into 6 classes. A fine-grained microblog emotion dictionary containing sentiment information, mood information and part-of-speech information is constructed from the emotion data, so that the word information is more comprehensive and the problem of data scarcity is alleviated.
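As an illustrative sketch only (not part of the disclosed method), the fusion of one token's text vector with the dictionary features could look as follows in Python; the concrete class labels, indices, dimensions and the random fallback are assumptions based on the classification described in the detailed embodiments below:

```python
import numpy as np

POS_TAGS = ["Noun", "Verb", "Adj", "Adv", "Nw", "Idiom", "Prep"]                    # 7 part-of-speech classes
MOODS = ["Happiness", "Like", "Anger", "Sadness", "Fear", "Disgust", "Surprise"]    # 7 mood classes
SENTIMENTS = ["positive", "negative", "adverb", "advocate", "negation", "neutral"]  # 6 sentiment classes

rng = np.random.default_rng(0)

def one_hot_like(index, size):
    """One-hot style vector; if the class is unknown, fall back to random values
    in [-0.1, 0.1] to reduce sparsity (one reading of the text)."""
    if index is None:
        return rng.uniform(-0.1, 0.1, size)
    v = np.zeros(size)
    v[index] = 1.0
    return v

def fuse(v_t, pos_idx=None, mood_idx=None, senti_idx=None):
    """X = V_T (+) V_P (+) V_M (+) V_E for a single token (row-vector splicing)."""
    v_p = one_hot_like(pos_idx, len(POS_TAGS))
    v_m = one_hot_like(mood_idx, len(MOODS))
    v_e = one_hot_like(senti_idx, len(SENTIMENTS))
    return np.concatenate([v_t, v_p, v_m, v_e])

token_vec = rng.normal(size=300)                         # pretrained text vector of one word
x = fuse(token_vec, pos_idx=2, mood_idx=0, senti_idx=0)  # e.g. an adjective, Happiness, positive
print(x.shape)  # (300 + 7 + 7 + 6,)
```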
Example 3
As shown in fig. 3, a network emotion analysis method based on a fine-grained emotion dictionary includes:
s310, performing semantic acquisition processing on the word vector text, and outputting an original text feature set matched with the web text;
s320, inputting the word vector text into an Attention layer, and outputting an Attention sequence after the Attention calculation;
s330, inputting the Attention sequence into a convolution layer, performing convolution operation, and outputting a characteristic matrix C;
s340, inputting the feature matrix C into a pooling layer for sampling operation, and outputting the original text feature set;
the semantic acquisition layer in the embodiment 3 comprises an Attention layer, a convolution layer and a pooling layer, wherein a word vector text is input into the Attention layer, an Attention sequence is output after the Attention calculation, the Attention calculation is mainly divided into three steps, namely, the similarity between Query and each Key is calculated, weights are acquired, common similarity functions comprise dot products, splicing, a sensing machine and the like, the weights are normalized by using a Softmax function, and finally the weights and corresponding Key Value values are weighted and summed to obtain the final Attention.
And inputting the characteristic matrix C into a pooling layer for sampling operation, outputting the original text characteristic set, and performing local characteristic extraction on an input sequence by the convolution layer through different convolution kernels.
And inputting the feature matrix C into a pooling layer for sampling operation, and outputting the original text feature set, wherein the pooling layer performs downsampling operation on the feature matrix C obtained after convolution, and selects local optimal features from the feature matrix C.
Example 4
As shown in fig. 4, a network emotion analysis method based on a fine-grained emotion dictionary includes:
s410, carrying out nonlinear transformation on the original text feature set to output an F value, and taking the F value as an index of network emotion analysis;
s420, inputting the original text feature set into a multilayer perceptron, outputting emotion tag score vectors, performing Softmax calculation on the emotion tag score vectors, and outputting relative probabilities of the emotion tag score vectors;
s430, correspondingly setting the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputting the network emotion analysis method parameters into an algorithm for calculating an F value, and taking the output F value as an index of network emotion analysis.
In embodiment 4, the original text feature set is input into the multilayer perceptron to obtain a higher-level feature representation, and an emotion label score vector is output; Softmax calculation is performed on the emotion label score vector, and the relative probability of the emotion label score vector is output. The network emotion analysis method parameters are then set according to the relative probability of the emotion label score vector and input into the algorithm for calculating the F value, and the output F value is taken as the index of network emotion analysis. The F value is calculated as follows:
Precision = system_correct / system_proposed
Recall = system_correct / gold
F = 2 × Precision × Recall / (Precision + Recall)
where gold is the number of manually labeled results, system_correct is the number of submitted results that match the manual labels, and system_proposed is the number of submitted results. This improves the accuracy of prediction.
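A minimal Python sketch of this metric; the function name f_value is illustrative, while the counter names gold, system_correct and system_proposed follow the text:

```python
def f_value(gold, system_correct, system_proposed):
    """Precision, recall and F value as defined above.

    gold            -- number of manually labeled results
    system_correct  -- number of submitted results that match the manual labels
    system_proposed -- number of submitted results
    """
    precision = system_correct / system_proposed if system_proposed else 0.0
    recall = system_correct / gold if gold else 0.0
    f = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f

# Example: 100 labeled samples, 90 predictions submitted, 80 of them correct.
print(f_value(gold=100, system_correct=80, system_proposed=90))
```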
Example 5
As shown in fig. 5, one specific embodiment may be:
s510, obtaining a web text, calculating the web text according to a fine-grained emotion dictionary method, and outputting a word vector text;
the fine-grained emotion dictionary method comprises text preprocessing and word vector representation, wherein the preprocessing of the web text is divided into two steps of data cleaning and Chinese word segmentation. The web texts comprise social networks and shopping platforms such as microblog, Twitter, WeChat, QQ, Face-book, Taobao, Jingdong and the like, and the web texts used in the method are microblog. And calculating the web text by using a fine-grained emotion dictionary method, and outputting a word vector text.
S520, preprocessing the acquired web text with jieba word segmentation, and outputting a web preprocessed text;
the method comprises the steps of microblog text preprocessing, data cleaning and Chinese word segmentation, wherein the data cleaning is to delete information irrelevant to emotion analysis, such as links, users, punctuations and the like in microblog texts.
S530, splicing the word vectors of the words in the network preprocessed text to obtain a network preprocessed word vector text, using the formula:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation;
the word vector comprises four parts, namely a text vector VT, a part-of-speech vector VP, an emotion vector VE and a mood vector VM., wherein the acquisition of the text vector can be regarded as a dictionary looking-up process, the dimension of a single vector in the dictionary is d, the number of words is N, the dictionary Vd × N is obtained by adopting a word vector training model through large-scale linguistic data, the word vector is a Chinese microblog word vector [16-17] sourced by the Beijing university Chinese information processing institute and the Chinese university DBIIR laboratory, and for a text sequence T ═ T1, T2, … and tn ], the word vectors of words in the text are spliced together to obtain the word vector representation of the whole text sequence.
S540, fusing the network preprocessed word vector text with the fine-grained emotion dictionary by the formula
X = V_T ⊕ V_P ⊕ V_M ⊕ V_E
to output a fused word vector text X, where V_P represents part-of-speech information, V_M represents mood information and V_E represents sentiment information;
According to the classification standard of the 'emotional vocabulary ontology library', parts of speech are divided into 7 classes: nouns (Noun), verbs (Verb), adjectives (Adj), adverbs (Adv), network words (Nw), idioms (Idiom) and prepositions (Prep). Moods are likewise divided into 7 classes: Happiness, Like, Anger, Sadness, Fear, Disgust and Surprise. The part-of-speech information and the mood information are expressed as the 7-dimensional vectors V_P and V_M in a one-hot-like coding manner. Sentiments are divided into 6 classes (positive emotion words, negative emotion words, adverbs, advocate words, negation words and neutral words) and are expressed as the 6-dimensional vector V_E. To reduce sparsity, V_P, V_M and V_E are initialized to random values in [-0.1, 0.1]. Finally, the text vector and these emotion features are combined to form the merged emotion word vector X.
S550, inputting the word vector text into an Attention layer, and outputting an Attention sequence after the Attention calculation;
The Attention calculation is divided into three main steps: first, the similarity between the Query and each Key is calculated to obtain the weights, where common similarity functions include the dot product, splicing and a perceptron; second, the weights are normalized with a Softmax function; finally, the weights and the corresponding Values are weighted and summed to obtain the final Attention. The Attention model proposed by the Google machine translation team uses the dot product for the similarity calculation, where the scaling factor d_k keeps the inner products from becoming too large.
S560, inputting the Attention sequence into a convolution layer, performing convolution operation, and outputting a characteristic matrix C;
the convolutional layer may perform local feature extraction on the input sequence by different convolutional checks. The length h of the convolution kernel can divide the sequence into { X0: h-1, X1: h, …, Xi: i + h-1, …, Xn-h +1: n }, and the convolution characteristics obtained by performing convolution operation on each component are as follows: c ═ C1,c2,…,cn-h+1) Wherein ci is the feature extracted after the convolution operation is performed on the component Xi i + h-1. The ci obtained for each sliding window is calculated as follows: c. Ci=relu(W·Xi:i+h-1+ b), W is the convolution kernel weight, b is the offset.
And S570, inputting the feature matrix C into a pooling layer for sampling operation, and outputting the original text feature set.
The pooling layer down-samples the feature matrix C obtained after convolution and selects the locally optimal features from it. Maximum pooling is adopted for sampling, and the feature obtained from each feature map is l_i = max(c_1, c_2, …, c_(n-h+1)); the resulting features are then combined to yield the vector L = (l_1, l_2, …, l_n). A multi-channel mode is used in the convolution layer, that is, several filters are selected to perform feature extraction on the sequence, and the features of the original text sentence are obtained through the above operations.
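A sketch of the maximum pooling and multi-channel combination step; the number of filters is an illustrative assumption:

```python
import numpy as np

def max_pool(feature_maps):
    """l_i = max(c_1, ..., c_(n-h+1)) per filter, combined into the vector L."""
    return np.array([fm.max() for fm in feature_maps])

rng = np.random.default_rng(0)
# Suppose 4 filters (channels) were applied in the convolution layer, each giving one feature map.
feature_maps = [rng.normal(size=8) for _ in range(4)]
L = max_pool(feature_maps)
print(L)  # one locally optimal feature per filter
```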
And S580, inputting the original text feature set into the multilayer perceptron, outputting emotion tag score vectors, performing Softmax calculation on the emotion tag score vectors, and outputting relative probabilities of the emotion tag score vectors.
The model here selects an MLP without any hidden layers, applies a nonlinear transformation to its output vector to obtain a score vector over the emotion labels, and then performs a Softmax operation on the emotion score vector.
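A sketch of the hidden-layer-free MLP followed by the Softmax operation; the number of emotion labels and the weight values are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
num_labels = 7                              # e.g. one score per mood class (an assumption)
features = rng.normal(size=4)               # pooled sentence features from the previous step
W = rng.normal(size=(num_labels, 4)) * 0.1  # MLP with no hidden layer: a single linear map
b = np.zeros(num_labels)

scores = W @ features + b   # emotion label score vector
probs = softmax(scores)     # relative probability of each emotion label
print(probs.round(3), float(probs.sum()))
```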
S590, correspondingly setting the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputting the network emotion analysis method parameters into an algorithm for calculating an F value, and taking the output F value as an index of network emotion analysis.
The parameter settings directly influence the model performance; through continuous parameter adjustment and optimization, the DB-AC model parameters used in this method are shown in the following table:
(Table: DB-AC model parameter settings, provided as an image in the original publication.)
the F value is calculated as follows:
Precision = system_correct / system_proposed
Recall = system_correct / gold
F = 2 × Precision × Recall / (Precision + Recall)
where gold is the number of manually labeled results, system_correct is the number of submitted results that match the manual labels, and system_proposed is the number of submitted results. This improves the accuracy of prediction.
Example 6
As shown in fig. 6, a network emotion analysis device based on a fine-grained emotion dictionary includes:
the obtaining module 10 is used for obtaining the web text, calculating the web text according to a fine-grained emotion dictionary method, and outputting a word vector text;
the matching module 20 is used for performing semantic acquisition processing on the word vector text and outputting an original text feature set matched with the network text;
and the output module 30 is used for carrying out nonlinear transformation on the original text feature set to output an F value, and the F value is used as an index of network emotion analysis.
One embodiment of the above apparatus may be as follows: the acquisition module 10 acquires the network text, calculates the network text with the fine-grained emotion dictionary method and outputs a word vector text; the matching module 20 performs semantic acquisition processing on the word vector text and outputs an original text feature set matched with the network text; finally, the output module 30 performs a nonlinear transformation on the original text feature set to output an F value, which is taken as the index of network emotion analysis.
Example 7
As shown in fig. 7, an obtaining module 10 of a network emotion analysis device based on a fine-grained emotion dictionary includes:
the preprocessing unit 12 is used for preprocessing the acquired web text with jieba word segmentation and outputting the web preprocessed text;
the combination and splicing unit 14 splices word vectors of words in the network preprocessed text to obtain a network preprocessed word vector text, and the formula is as follows:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation;
the fusion unit 16 fuses the network preprocessed word vector text with the fine-grained emotion dictionary by the formula
X = V_T ⊕ V_P ⊕ V_M ⊕ V_E
and outputs a fused word vector text X, where V_P represents part-of-speech information, V_M represents mood information and V_E represents sentiment information.
One embodiment of the acquisition module 10 of the above apparatus may be as follows: the preprocessing unit 12 preprocesses the acquired web text with jieba word segmentation and outputs the web preprocessed text; the combination and splicing unit 14 then splices the word vectors of the words in the network preprocessed text to obtain the network preprocessed word vector text, using the formula:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation; finally, the fusion unit 16 fuses the network preprocessed word vector text with the fine-grained emotion dictionary by the formula
X = V_T ⊕ V_P ⊕ V_M ⊕ V_E
and outputs a fused word vector text X, where V_P represents part-of-speech information, V_M represents mood information and V_E represents sentiment information.
Example 8
As shown in fig. 8, a matching module 20 of a network emotion analysis device based on a fine-grained emotion dictionary includes:
an Attention layer unit 22, which inputs the word vector text into an Attention layer, and outputs an Attention sequence after the Attention calculation;
a convolutional layer unit 24 for inputting the Attention sequence into a convolutional layer and performing a convolutional operation, and outputting a feature matrix C;
and the pooling layer unit 26 is used for inputting the feature matrix C into a pooling layer for sampling operation and outputting the original text feature set.
One embodiment of the matching module 20 of the above apparatus may be as follows: the Attention layer unit 22 inputs the word vector text into the Attention layer and outputs the Attention sequence after the Attention calculation; the convolution layer unit 24 then inputs the Attention sequence into the convolution layer, performs the convolution operation and outputs the feature matrix C; finally, the pooling layer unit 26 inputs the feature matrix C into the pooling layer for the sampling operation and outputs the original text feature set.
Example 9
As shown in fig. 9, an output module 30 of a network emotion analysis device based on a fine-grained emotion dictionary includes:
a Softmax calculation unit 32, which inputs the original text feature set into the multilayer perceptron, outputs an emotion tag score vector, performs Softmax calculation on the emotion tag score vector, and outputs the relative probability of the emotion tag score vector;
and the F value calculating unit 34 is used for correspondingly setting the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputting the network emotion analysis method parameters into an algorithm for calculating the F value, and outputting the F value as an index of network emotion analysis.
One embodiment of the output module 30 of the above apparatus may be as follows: the Softmax calculation unit 32 inputs the original text feature set into the multilayer perceptron, outputs the emotion label score vector, performs the Softmax calculation on the emotion label score vector and outputs the relative probability of the emotion label score vector; the F value calculation unit 34 then sets the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputs them into the algorithm for calculating the F value, and outputs the F value as the index of network emotion analysis.
Example 10
As shown in fig. 10, one specific embodiment may be:
s1010, obtaining a web text, calculating the web text according to a fine-grained emotion dictionary method, and outputting a word vector text;
the fine-grained emotion dictionary method comprises text preprocessing and word vector representation, wherein the preprocessing of the web text is divided into two steps of data cleaning and Chinese word segmentation. The web texts comprise social networks and shopping platforms such as microblog, Twitter, WeChat, QQ, Face-book, Taobao, Jingdong and the like, and the web texts used in the method are microblog. And calculating the web text by using a fine-grained emotion dictionary method, and outputting a word vector text.
S1020, preprocessing the acquired web text with jieba word segmentation, and outputting a web preprocessed text;
the method comprises the steps of microblog text preprocessing, data cleaning and Chinese word segmentation, wherein the data cleaning is to delete information irrelevant to emotion analysis, such as links, users, punctuations and the like in microblog texts.
S1030, splicing word vectors of words in the network preprocessed text to obtain a network preprocessed word vector text, wherein the formula is as follows:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation;
the word vector comprises four parts, namely a text vector VT, a part-of-speech vector VP, an emotion vector VE and a mood vector VM., wherein the acquisition of the text vector can be regarded as a dictionary looking-up process, the dimension of a single vector in the dictionary is d, the number of words is N, the dictionary Vd × N is obtained by adopting a word vector training model through large-scale linguistic data, the word vector is a Chinese microblog word vector [16-17] sourced by the Beijing university Chinese information processing institute and the Chinese university DBIIR laboratory, and for a text sequence T ═ T1, T2, … and tn ], the word vectors of words in the text are spliced together to obtain the word vector representation of the whole text sequence.
S1040, fusing the network preprocessed word vector text with the fine-grained emotion dictionary by the formula
X = V_T ⊕ V_P ⊕ V_M ⊕ V_E
to output a fused word vector text X, where V_P represents part-of-speech information, V_M represents mood information and V_E represents sentiment information;
According to the classification standard of the 'emotional vocabulary ontology library', parts of speech are divided into 7 classes: nouns (Noun), verbs (Verb), adjectives (Adj), adverbs (Adv), network words (Nw), idioms (Idiom) and prepositions (Prep). Moods are likewise divided into 7 classes: Happiness, Like, Anger, Sadness, Fear, Disgust and Surprise. The part-of-speech information and the mood information are expressed as the 7-dimensional vectors V_P and V_M in a one-hot-like coding manner. Sentiments are divided into 6 classes (positive emotion words, negative emotion words, adverbs, advocate words, negation words and neutral words) and are expressed as the 6-dimensional vector V_E. To reduce sparsity, V_P, V_M and V_E are initialized to random values in [-0.1, 0.1]. Finally, the text vector and these emotion features are combined to form the merged emotion word vector X.
S1050, inputting the word vector text into an Attention layer, and outputting an Attention sequence after the Attention calculation;
The Attention calculation is divided into three main steps: first, the similarity between the Query and each Key is calculated to obtain the weights, where common similarity functions include the dot product, splicing and a perceptron; second, the weights are normalized with a Softmax function; finally, the weights and the corresponding Values are weighted and summed to obtain the final Attention. The Attention model proposed by the Google machine translation team uses the dot product for the similarity calculation, where the scaling factor d_k keeps the inner products from becoming too large.
S1060, inputting the Attention sequence into a convolution layer, performing convolution operation, and outputting a characteristic matrix C;
the convolutional layer may check the sequence of inputs by different convolutionsAnd extracting local features. The length h of the convolution kernel can divide the sequence into { X0: h-1, X1: h, …, Xi: i + h-1, …, Xn-h +1: n }, and the convolution characteristics obtained by performing convolution operation on each component are as follows: c ═ C1,c2,…,cn-h+1) Wherein ci is the feature extracted after the convolution operation is performed on the component Xi i + h-1. The ci obtained for each sliding window is calculated as follows: c. Ci=relu(W·Xi:i+h-1+ b), W is the convolution kernel weight, b is the offset.
And S1070, inputting the feature matrix C into a pooling layer for sampling operation, and outputting the original text feature set.
The pooling layer down-samples the feature matrix C obtained after convolution and selects the locally optimal features from it. Maximum pooling is adopted for sampling, and the feature obtained from each feature map is l_i = max(c_1, c_2, …, c_(n-h+1)); the resulting features are then combined to yield the vector L = (l_1, l_2, …, l_n). A multi-channel mode is used in the convolution layer, that is, several filters are selected to perform feature extraction on the sequence, and the features of the original text sentence are obtained through the above operations.
And S1080, inputting the original text feature set into the multilayer perceptron, outputting emotion tag score vectors, performing Softmax calculation on the emotion tag score vectors, and outputting the relative probability of the emotion tag score vectors.
The model here selects an MLP without any hidden layers, applies a nonlinear transformation to its output vector to obtain a score vector over the emotion labels, and then performs a Softmax operation on the emotion score vector.
And S1090, correspondingly setting the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputting the network emotion analysis method parameters into an algorithm for calculating an F value, and using the output F value as an index of network emotion analysis.
The parameter settings directly influence the model performance; through continuous parameter adjustment and optimization, the DB-AC model parameters used in this method are shown in the following table:
(Table: DB-AC model parameter settings, provided as an image in the original publication.)
the F value is calculated as follows:
Precision = system_correct / system_proposed
Recall = system_correct / gold
F = 2 × Precision × Recall / (Precision + Recall)
where gold is the number of manually labeled results, system_correct is the number of submitted results that match the manual labels, and system_proposed is the number of submitted results. This improves the accuracy of prediction.
Example 11
As shown in fig. 11, an electronic device comprises a memory 1101 and a processor 1102, wherein the memory 1101 is used for storing one or more computer instructions, and the one or more computer instructions are executed by the processor 1102 to implement a fine-grained emotion dictionary based network emotion analysis method as described above.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the electronic device described above may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
A computer-readable storage medium storing a computer program, which when executed by a computer, implements a fine-grained sentiment dictionary-based network sentiment analysis method as described above.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 1101 and executed by the processor 1102 to implement the present invention. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, the instruction segments being used to describe the execution of a computer program in a computer device.
The computer device may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The computer device may include, but is not limited to, a memory 1101, a processor 1102. Those skilled in the art will appreciate that the present embodiments are merely exemplary of a computing device and are not intended to limit the computing device, and may include more or fewer components, or some of the components may be combined, or different components, e.g., the computing device may also include input output devices, network access devices, buses, etc.
The processor 1102 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor 1102 may be any conventional processor.
The storage 1101 may be an internal storage unit of the computer device, such as a hard disk or a memory of the computer device. The memory 1101 may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card (FlashCard), etc. provided on the computer device. Further, the memory 1101 may also include both an internal storage unit and an external storage device of the computer device. The memory 1101 is used to store computer programs and other programs and data required by the computer device. The memory 1101 may also be used to temporarily store data that has been output or is to be output.
The above description is only an embodiment of the present invention, but the technical features of the present invention are not limited thereto, and any changes or modifications within the technical field of the present invention by those skilled in the art are covered by the claims of the present invention.

Claims (10)

1. A network emotion analysis method based on a fine-grained emotion dictionary is characterized by comprising the following steps:
acquiring a web text, calculating the web text according to a fine-grained emotion dictionary method, and outputting a word vector text;
semantic acquisition processing is carried out on the word vector text, and an original text characteristic set matched with the network text is output;
and carrying out nonlinear transformation on the original text feature set to output an F value, and taking the F value as an index of network emotion analysis.
2. The method for analyzing network emotion based on fine-grained emotion dictionary according to claim 1, wherein the steps of obtaining a network text, calculating the network text according to the fine-grained emotion dictionary method, and outputting a word vector text comprise:
preprocessing the acquired network text with jieba word segmentation, and outputting a network preprocessed text;
splicing word vectors of words in the network preprocessed text to obtain a network preprocessed word vector text, wherein the formula is as follows:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation;
fusing the network preprocessed word vector text with the fine-grained emotion dictionary by the formula
X = V_T ⊕ V_P ⊕ V_M ⊕ V_E
to output a fused word vector text X, where V_P represents part-of-speech information, V_M represents mood information and V_E represents sentiment information.
3. The method for analyzing network emotion based on fine-grained emotion dictionary according to claim 2, wherein the semantic acquisition processing is performed on the word vector text, and an original text feature set matched with the network text is output, and the method comprises:
inputting the word vector text into an Attention layer, and outputting an Attention sequence after the Attention calculation;
inputting the Attention sequence into a convolution layer, performing convolution operation, and outputting a characteristic matrix C;
and inputting the feature matrix C into a pooling layer for sampling operation, and outputting the original text feature set.
4. The network emotion analysis method based on the fine-grained emotion dictionary, as recited in claim 3, wherein the step of performing nonlinear transformation on the original text feature set to output an F value, and the step of using the F value as an index of network emotion analysis comprises the steps of:
inputting the original text feature set into a multilayer perceptron, outputting emotion tag score vectors, performing Softmax calculation on the emotion tag score vectors, and outputting relative probabilities of the emotion tag score vectors;
and correspondingly setting the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputting the network emotion analysis method parameters into an algorithm for calculating an F value, and taking the output F value as an index of network emotion analysis.
5. A network emotion analysis device based on a fine-grained emotion dictionary is characterized by comprising:
the acquisition module is used for acquiring the web text, calculating the web text according to a fine-grained emotion dictionary method and outputting a word vector text;
the matching module is used for performing semantic acquisition processing on the word vector text and outputting an original text characteristic set matched with the network text;
and the output module is used for carrying out nonlinear transformation on the original text feature set to output an F value, and the F value is used as an index of network emotion analysis.
6. The apparatus for analyzing network emotion based on fine-grained emotion dictionary according to claim 5, wherein the obtaining module comprises:
the preprocessing unit is used for preprocessing the acquired network text with jieba word segmentation and outputting a network preprocessed text;
the combination splicing unit is used for splicing word vectors of words in the network preprocessed text to obtain a network preprocessed word vector text, and the formula is as follows:
V_T = V_1 ⊕ V_2 ⊕ … ⊕ V_n
where V_i ∈ V^(d×N) is the dictionary entry corresponding to t_i and ⊕ denotes the row-vector splicing operation;
and a fusion unit for fusing the network preprocessed word vector text with the fine-grained emotion dictionary by the formula
X = V_T ⊕ V_P ⊕ V_M ⊕ V_E
to output a fused word vector text X, where V_P represents part-of-speech information, V_M represents mood information and V_E represents sentiment information.
7. The apparatus for analyzing network emotion based on fine-grained emotion dictionary according to claim 6, wherein the matching module comprises:
the Attention layer unit is used for inputting the word vector text into an Attention layer and outputting an Attention sequence after the Attention calculation;
the convolutional layer unit is used for inputting the Attention sequence into a convolutional layer and carrying out convolution operation to output a characteristic matrix C;
and the pooling layer unit is used for inputting the feature matrix C into a pooling layer for sampling operation and outputting the original text feature set.
8. The apparatus of claim 7, wherein the output module comprises:
the Softmax calculation unit is used for inputting the original text feature set into the multilayer perceptron, outputting emotion tag score vectors, performing Softmax calculation on the emotion tag score vectors and outputting relative probabilities of the emotion tag score vectors;
and the F value calculating unit is used for correspondingly setting the network emotion analysis method parameters according to the relative probability of the emotion label score vector, inputting the network emotion analysis method parameters into an algorithm for calculating the F value, and outputting the F value as an index of network emotion analysis.
9. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer instructions, and wherein the one or more computer instructions are executed by the processor to implement a fine-grained sentiment dictionary based network sentiment analysis method as claimed in any one of claims 1 to 4.
10. A computer-readable storage medium storing a computer program, wherein the computer program is configured to enable a computer to execute the method for network emotion analysis based on a fine-grained emotion dictionary according to any one of claims 1 to 4.
CN202010202982.8A 2020-03-20 2020-03-20 Network emotion analysis method based on fine-grained emotion dictionary Pending CN111414755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010202982.8A CN111414755A (en) 2020-03-20 2020-03-20 Network emotion analysis method based on fine-grained emotion dictionary

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010202982.8A CN111414755A (en) 2020-03-20 2020-03-20 Network emotion analysis method based on fine-grained emotion dictionary

Publications (1)

Publication Number Publication Date
CN111414755A true CN111414755A (en) 2020-07-14

Family

ID=71494434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010202982.8A Pending CN111414755A (en) 2020-03-20 2020-03-20 Network emotion analysis method based on fine-grained emotion dictionary

Country Status (1)

Country Link
CN (1) CN111414755A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221534A (en) * 2021-05-25 2021-08-06 深圳和锐网络科技有限公司 Text emotion analysis method and device, electronic equipment and storage medium
CN115545026A (en) * 2022-10-13 2022-12-30 深圳占领信息技术有限公司 Network emotion analysis system based on fine-grained emotion dictionary

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007018234A (en) * 2005-07-07 2007-01-25 National Institute Of Information & Communication Technology Automatic feeling-expression word and phrase dictionary generating method and device, and automatic feeling-level evaluation value giving method and device
CN105426381A (en) * 2015-08-27 2016-03-23 浙江大学 Music recommendation method based on emotional context of microblog
CN107092596A (en) * 2017-04-24 2017-08-25 重庆邮电大学 Text emotion analysis method based on attention CNNs and CCR
US20190138599A1 (en) * 2017-11-09 2019-05-09 Conduent Business Services, Llc Performing semantic analyses of user-generated text content using a lexicon
CN109933664A (en) * 2019-03-12 2019-06-25 中南大学 A kind of fine granularity mood analysis improved method based on emotion word insertion

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007018234A (en) * 2005-07-07 2007-01-25 National Institute Of Information & Communication Technology Automatic feeling-expression word and phrase dictionary generating method and device, and automatic feeling-level evaluation value giving method and device
CN105426381A (en) * 2015-08-27 2016-03-23 浙江大学 Music recommendation method based on emotional context of microblog
CN107092596A (en) * 2017-04-24 2017-08-25 重庆邮电大学 Text emotion analysis method based on attention CNNs and CCR
US20190138599A1 (en) * 2017-11-09 2019-05-09 Conduent Business Services, Llc Performing semantic analyses of user-generated text content using a lexicon
CN109933664A (en) * 2019-03-12 2019-06-25 中南大学 A kind of fine granularity mood analysis improved method based on emotion word insertion

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
姜飞, 张辉, 刘奕群, 张敏, 马少平: "THUIR-SentiSenti Chinese Microblog Emotion Analysis Evaluation Report", pages 1-6 *
张仰森, 郑佳, 黄改娟, 蒋玉茹: "A Microblog Sentiment Analysis Method Based on a Dual Attention Model", vol. 58, no. 2, pages 122-130 *
戴立武: "Research on Chinese Sentiment Analysis Based on Deep Neural Networks", pages 53-54 *
蒋加伏, 朱前飞: "Fundamentals of Python Programming", vol. 1, Beijing: Beijing University of Posts and Telecommunications Press, page 247 *
陈欣; 于俊洋; 赵媛媛: "Research on a Text Processing Model Based on CNN and B-LSTM", no. 05, pages 109-114 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221534A (en) * 2021-05-25 2021-08-06 深圳和锐网络科技有限公司 Text emotion analysis method and device, electronic equipment and storage medium
CN115545026A (en) * 2022-10-13 2022-12-30 深圳占领信息技术有限公司 Network emotion analysis system based on fine-grained emotion dictionary

Similar Documents

Publication Publication Date Title
Arulmurugan et al. RETRACTED ARTICLE: Classification of sentence level sentiment analysis using cloud machine learning techniques
Mahmood et al. Deep sentiments in roman urdu text using recurrent convolutional neural network model
Bhuvaneshwari et al. Sentiment analysis for user reviews using Bi-LSTM self-attention based CNN model
Zobeidi et al. Opinion mining in Persian language using a hybrid feature extraction approach based on convolutional neural network
Peng et al. Human–machine dialogue modelling with the fusion of word-and sentence-level emotions
CN111126067B (en) Entity relationship extraction method and device
Liu et al. R-trans: RNN transformer network for Chinese machine reading comprehension
CN112926308B (en) Method, device, equipment, storage medium and program product for matching text
Choi et al. Residual-based graph convolutional network for emotion recognition in conversation for smart Internet of Things
Banik et al. Gru based named entity recognition system for bangla online newspapers
CN111581364B (en) Chinese intelligent question-answer short text similarity calculation method oriented to medical field
Jin et al. Multi-label sentiment analysis base on BERT with modified TF-IDF
Jia Sentiment classification of microblog: A framework based on BERT and CNN with attention mechanism
CN115169361A (en) Emotion analysis method and related equipment thereof
CN111414755A (en) Network emotion analysis method based on fine-grained emotion dictionary
CN114547303A (en) Text multi-feature classification method and device based on Bert-LSTM
Samih et al. Enhanced sentiment analysis based on improved word embeddings and XGboost.
Chan et al. Optimization of language models by word computing
CN111681731A (en) Method for automatically marking colors of inspection report
CN115906824A (en) Text fine-grained emotion analysis method, system, medium and computing equipment
Ren et al. ABML: attention-based multi-task learning for jointly humor recognition and pun detection
Chen et al. Learning the chinese sentence representation with LSTM autoencoder
CN114443846A (en) Classification method and device based on multi-level text abnormal composition and electronic equipment
Vasili et al. Sentiment analysis on social media for Albanian language
Li et al. Emotion analysis for the upcoming response in open-domain human-computer conversation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination