CN114742070A - Text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution - Google Patents

Text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution

Info

Publication number
CN114742070A
Authority
CN
China
Prior art keywords
word
layer
word vector
convolution
bit sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210434984.9A
Other languages
Chinese (zh)
Inventor
陈平华
林哲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202210434984.9A
Publication of CN114742070A
Legal status: Pending

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06F ELECTRIC DIGITAL DATA PROCESSING
                • G06F 40/00 Handling natural language data
                    • G06F 40/30 Semantic analysis
                    • G06F 40/20 Natural language analysis
                        • G06F 40/279 Recognition of textual entities
                            • G06F 40/284 Lexical analysis, e.g. tokenisation or collocates
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00 Computing arrangements based on biological models
                    • G06N 3/02 Neural networks
                        • G06N 3/04 Architecture, e.g. interconnection topology
                            • G06N 3/045 Combinations of networks
                            • G06N 3/048 Activation functions
                        • G06N 3/08 Learning methods


Abstract

The invention discloses a text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution, comprising the following steps: obtaining an evaluation sentence, inputting it into a trained bidirectional standard convolutional network model with bit sequence information, and outputting a text emotion analysis result through the model. The bidirectional standard convolutional network model with bit sequence information comprises a word embedding layer, a bit sequence information layer, a word vector deformation layer, a convolution layer, a double-head structure and a classification layer. The word embedding layer converts the words of the evaluation sentence into word vectors that a computer can process; the bit sequence information layer adds bit sequence information to the word vectors, yielding word vectors with bit sequence information; the word vector deformation layer reshapes each word vector into a word block matrix and finally splices the blocks into a sentence matrix; the convolution layer performs a convolution operation on the sentence matrix to obtain word vector features; and the classification layer classifies the spliced word vector features to obtain the final text emotion analysis result.

Description

Text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution
Technical Field
The invention relates to the field of natural language processing, in particular to a text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution.
Background
The convolution used in the emotion classification task of the current neural network model still has the following problems:
(1) The convolution kernel is typically n × D, where n is the number of word vectors computed by the kernel at one time and D is the word vector dimension. The word vector dimension is typically 200 to 500, so the convolution kernel is too large, resulting in a large number of parameters.
(2) The convolution is performed with a sliding window, which at each step captures semantic information only between adjacent words, not between words spaced further apart.
Disclosure of Invention
The invention aims to provide a text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution that solves the above problems of neural networks in existing emotion classification tasks.
To achieve this, the invention adopts the following technical scheme:
a text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution comprises the following steps:
obtaining an evaluation statement, inputting the evaluation statement into a trained bidirectional standard convolution network model with bit sequence information, and outputting a text emotion analysis result through the model; the bidirectional standard convolution network model with the bit sequence information comprises a word embedding layer, a bit sequence information layer, a word vector deformation layer, a convolution layer, a double-head structure and a classification layer;
the word embedding layer is used for converting words in the evaluation sentences into word vectors which can be understood by a computer, and words with similar meanings are mapped to similar positions in a vector space;
the bit sequence information layer is used for receiving the word vectors output by the word embedding layer and adding bit sequence information to the word vectors to obtain word vectors with the bit sequence information; the bit sequence information adopts the idea of performing attention mechanism on a channel, each word vector in the evaluation statement is convolved to generate a trainable bit sequence vector, and the trainable bit sequence vector is multiplied to the original word vector to strengthen the context relation;
the word vector deformation layer is used for receiving the word vectors with the bit sequence information, deforming the word vectors into a word block matrix, and finally splicing the word vectors into a sentence matrix;
the convolution layer is used for receiving the sentence matrix output by the word vector deformation layer and performing a 3 × 3 convolution operation on it to obtain the first word vector feature;
the double-head structure comprises a branch parallel to the word embedding layer, the bit sequence information layer, the word vector deformation layer and the convolution layer; the word sequence of the evaluation sentence is reversed and then input into the branch, and after processing by the branch's word embedding layer, bit sequence information layer, word vector deformation layer and convolution layer, the second word vector feature output by the branch's convolution layer is obtained; the first word vector feature and the second word vector feature are spliced to obtain the spliced word vector features;
and the classification layer is used for performing classification operation on the spliced word vector characteristics to obtain a final text emotion analysis result.
Further, the word embedding layer performs word vector processing with a word2vec model: the evaluation sentence is segmented into words, and each word is then expressed in space-vector form;
the word vectors are generated by the word2vec model using three fully connected layers, giving the word vector representation of the evaluation sentence: S = [x1, x2, ..., xn], xi ∈ R^d (i = 1, 2, ..., n), where R is the set of real numbers and d denotes the dimension of a word vector.
Further, to obtain the bit sequence information of a word vector, an n × 1 convolution kernel slides with stride 1 over the word vector representation S of the evaluation sentence to generate a learnable position value, which is multiplied onto the word matrix block to retain position information; the bit sequence information c of a word vector is computed as:
c = f(W1 · xi + b1)
where xi is a word vector, W1 is the weight matrix of the convolution kernel, b1 is the bias, and f is the activation function of the convolution kernel; the word vector with added bit sequence information is represented as:
x'i = c × xi
The finally obtained representation of the evaluation sentence containing bit sequence information is: Sp = [x'1, x'2, ..., x'n].
Further, any word vector with bit sequence information x'i = [k1, k2, ..., k(h×h)], where k denotes an element of the word vector and the word vector has h × h elements, is deformed into the following word block matrix:

    Mi = [ k1          k2          ...  kh
           k(h+1)      k(h+2)      ...  k(2h)
           ...
           k((h-1)h+1) k((h-1)h+2) ...  k(h×h) ]

Then the word block matrices of all word vectors of the evaluation sentence are spliced in row-major order; if there are not enough blocks to form a square matrix, all-zero h × h blocks are used as padding. The spliced sentence matrix is:

    St = [ M1      M2      ...  Mm
           M(m+1)  M(m+2)  ...  M(2m)
           ...                  0     ]

where 0 denotes an all-zero padding block.
further, the classification layer is realized by adopting a 2-layer fully-connected layer, the layer 1 adopts a fully-connected layer with the neuron number of 8 and the activation function of tanh function, the layer 2 adopts a fully-connected layer with the neuron number of 2 and the activation function of sigmoid.
Further, when training the network model, an evaluation sentence data set is first prepared and divided into a training set and a test set. The training set is imported into the network model for training: the sentences of the training set are first segmented with a word segmentation toolkit; each segmented word passes through the word embedding layer to generate a word vector, and the bit sequence information layer then adds bit sequence information to each word vector. The word vectors with added bit sequence information are deformed in the word vector deformation layer, where one-dimensional word vectors whose length equals the word vector dimension are converted into h × h word block matrices, and the word block matrices of all word vectors of a sentence form a sentence matrix. Finally, the convolution layer convolves the sentence matrix with a 3 × 3 convolution kernel to obtain word vector features. To further retain the position information of the original sentence, the double-head structure processes the reversed sentence through a word embedding layer, bit sequence information layer, word vector deformation layer and convolution layer to obtain a second set of word vector features; after the two sets of word vector features are spliced, a fully connected layer performs text emotion classification. After the network model has been trained on the training set, it is tested with the test set to obtain the final network model.
A terminal device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, it implements the steps of the above text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution.
A computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the aforementioned text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution.
Compared with the prior art, the invention has the following technical characteristics:
the invention provides a bidirectional standard convolution network model Bi-2DCNN (bidirectional head convolution text classification model based on word vector definition with position information), which adopts standard 3 multiplied by 3 convolution kernel to extract features and further carry out text emotion analysis in order to reduce the parameter number of a convolution layer; the vector quantity of one-dimensional words is changed into a word matrix block form, matrix blocks of a plurality of words in sentences are spliced into a matrix, and information is lost as little as possible while the requirement of the convolutional layer input is met; in order to avoid losing the sequence information of the words in the process of word vector deformation splicing, adding the bit sequence information after the word vectors are changed into word matrix blocks; in order to further compensate for word position information loss caused by deformed splicing and better capture bidirectional semantic dependence between sentence words, a network structure with a double-head structure is provided, and further the text sentiment analysis effect is improved.
Drawings
Fig. 1 is a schematic structural diagram of a bidirectional standard convolutional network model with bit sequence information in the present invention.
Detailed Description
Referring to the attached drawings, the invention provides a text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution, which comprises the following steps:
obtaining an evaluation statement, inputting the evaluation statement into a trained Bi-directional standard convolutional network model (Bi-2DCNN) with bit sequence information, and outputting a text emotion analysis result through the model, wherein the Bi-directional standard convolutional network model with the bit sequence information comprises a word embedding layer, a bit sequence information layer, a word vector deformation layer, a convolutional layer, a double-head structure and a classification layer:
1. word embedding layer
The word embedding layer is used for converting words in the evaluation statement into word vectors which can be understood by a computer.
The network model of this scheme performs word vector processing with a word2vec model: the evaluation sentence is first segmented into words, each word is then expressed in space-vector form, and words with similar meanings are mapped to nearby positions in the vector space.
The word vectors are generated by the word2vec model using three fully connected layers, giving the word vector representation of the evaluation sentence: S = [x1, x2, ..., xn], xi ∈ R^d (i = 1, 2, ..., n), where R is the set of real numbers and d denotes the dimension of a word vector, i.e. each word vector xi in the evaluation sentence is expressed with d real numbers. The evaluation sentence S can also be written as x1:n, that is:

    x1:n = x1 ⊕ x2 ⊕ ... ⊕ xn

where ⊕ denotes the concatenation operation and x1:n ∈ R^(d×n).
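As a rough illustration (not the patent's code), the word embedding representation above can be sketched in NumPy; the sentence length, dimension, and random vectors standing in for word2vec output are assumptions:

```python
import numpy as np

# Hypothetical sketch: each word of the evaluation sentence becomes a
# d-dimensional vector, and the sentence representation S stacks them.
rng = np.random.default_rng(0)

n, d = 5, 9          # 5 words; d = 9 is a perfect square, so a word fits a 3x3 block
word_vectors = [rng.standard_normal(d) for _ in range(n)]  # stand-in for word2vec output

S = np.stack(word_vectors, axis=0)   # S = [x1, ..., xn], one row per word
assert S.shape == (n, d)
```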
2. Bit-order information layer
And the bit sequence information layer is used for receiving the word vectors output by the word embedding layer and adding bit sequence information to the word vectors to obtain the word vectors with the bit sequence information.
The bit sequence information adopts the idea of performing attention mechanism on a channel, each word vector in the evaluation statement is convolved to generate a trainable bit sequence vector, and the trainable bit sequence vector is multiplied to the original word vector to strengthen the context relation.
To obtain the bit sequence information of a word vector, an n × 1 convolution kernel slides with stride 1 over the word vector representation S of the evaluation sentence to generate a learnable position value, which is multiplied onto the word matrix block to retain position information. Although the whole word vector is still convolved, the width of the convolution kernel is 1, so its parameters are far fewer than in conventional models. The bit sequence information c of a word vector is computed as:
c = f(W1 · xi + b1)
where xi is a word vector, W1 is the weight matrix of the convolution kernel, b1 is the bias, and f is the activation function of the convolution kernel. The word vector with added bit sequence information is represented as:
x'i = c × xi
The finally obtained representation of the evaluation sentence containing bit sequence information is: Sp = [x'1, x'2, ..., x'n].
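A minimal NumPy sketch of the bit sequence information layer described above: each word vector gets one scalar position value c_i = f(W1 · xi + b1), which is multiplied back onto the vector. The tanh activation and random weights are illustrative assumptions, not the patent's trained parameters:

```python
import numpy as np

# Sketch of the bit-order (positional) information layer.
rng = np.random.default_rng(1)

n, d = 4, 9
S = rng.standard_normal((n, d))        # word vectors from the embedding layer
W1 = rng.standard_normal(d)            # width-1 kernel weights over each word vector
b1 = 0.1

c = np.tanh(S @ W1 + b1)               # one learnable scalar per word, shape (n,)
S_p = c[:, None] * S                   # x'_i = c_i * x_i; shape is unchanged
assert S_p.shape == S.shape
```

Note the kernel only needs d parameters per word (width 1), which is what keeps this layer cheap compared with a full-word convolution.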
3. Word vector transformation layer
The word vector transformation layer is used for receiving the word vectors with the bit sequence information, transforming the word vectors into a word block matrix, and finally splicing the word block matrix into a sentence matrix.
The deformation operation ensures that as little information as possible is lost during convolution of the word vectors, while guaranteeing that the input meets the requirements of the convolution layer.
For any word vector with bit sequence information x'i = [k1, k2, ..., k(h×h)], where k denotes an element of the word vector and the word vector has h × h elements, the deformation produces the following word block matrix:

    Mi = [ k1          k2          ...  kh
           k(h+1)      k(h+2)      ...  k(2h)
           ...
           k((h-1)h+1) k((h-1)h+2) ...  k(h×h) ]

Then the word block matrices of all word vectors of the evaluation sentence are spliced in row-major order; if there are not enough blocks to form a square matrix, all-zero h × h blocks are used as padding. The spliced sentence matrix is:

    St = [ M1      M2      ...  Mm
           M(m+1)  M(m+2)  ...  M(2m)
           ...                  0     ]

where 0 denotes an all-zero padding block. The word vector deformation layer ensures that convolution loses as little word vector information as possible. For example, for a word vector of length 9, xi = [k1, k2, k3, k4, k5, k6, k7, k8, k9], a direct 3 × 3 convolution can only guarantee that k1, k2, k3 are computed together by the kernel. After deformation we obtain:

    [ k1 k2 k3
      k4 k5 k6
      k7 k8 k9 ]

so a single 3 × 3 convolution covers k1 through k9 in one kernel computation, ensuring that the word vector loses as little information as possible during the 3 × 3 convolution.
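The deformation and splicing steps can be sketched in NumPy as follows; the row-major block layout and zero-padding scheme match the description above, while the helper name `deform` is ours:

```python
import numpy as np

# Reshape each length-h*h word vector into an h x h block, then tile the
# blocks row-major into a square sentence matrix, padding with zero blocks.
def deform(word_vectors, h):
    blocks = [v.reshape(h, h) for v in word_vectors]        # one h x h block per word
    m = int(np.ceil(np.sqrt(len(blocks))))                  # blocks per grid row/column
    blocks += [np.zeros((h, h))] * (m * m - len(blocks))    # zero-pad to an m x m grid
    rows = [np.hstack(blocks[r * m:(r + 1) * m]) for r in range(m)]
    return np.vstack(rows)                                  # sentence matrix, (m*h, m*h)

x = np.arange(1.0, 10.0)                 # k1..k9, as in the example above
sent = deform([x, x, x], h=3)            # 3 words -> 2x2 grid with one zero block
assert sent.shape == (6, 6)
assert sent[0, 0] == 1.0                 # first block starts at k1
assert sent[3, 3] == 0.0                 # bottom-right block is zero padding
```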
4. Convolutional layer
The convolution layer receives the sentence matrix St output by the word vector deformation layer and performs a 3 × 3 convolution operation on it to obtain the first word vector feature C1:
C1 = f(W2 · St + b2)
where W2 is the weight matrix of the convolution kernel, b2 is the bias, and f is the activation function of the convolution kernel. This scheme chooses 3 × 3 convolution rather than convolving whole word vectors because, for whole-vector convolution, the kernel size is determined by the word vector length: if the word vector dimension is 1000 and 2 words are convolved at a time, the kernel has 2000 parameters. With a 3 × 3 convolution the kernel has only 9 parameters, greatly reducing the number of trainable parameters of the model.
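The parameter comparison in the paragraph above is simple arithmetic; the figures below reproduce it (bias terms are ignored, matching the description):

```python
# Kernel parameter counts: convolving 2 whole word vectors of dimension 1000
# needs a 2 x 1000 kernel; the standard kernel used here is 3 x 3.
d, words_per_window = 1000, 2
full_vector_kernel_params = words_per_window * d   # whole-word-vector convolution
small_kernel_params = 3 * 3                        # standard 3x3 convolution
assert full_vector_kernel_params == 2000
assert small_kernel_params == 9
```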
5. Double-head structure
The double-head structure comprises a branch parallel to the word embedding layer, the bit sequence information layer, the word vector deformation layer and the convolution layer. The word order of the evaluation sentence is reversed and then input into the branch; after the same word embedding, bit sequence information, word vector deformation and convolution processing as above, the branch's convolution layer outputs the second word vector feature C2.
The first word vector feature C1 and the second word vector feature C2 are spliced to obtain the spliced word vector features.
After the word vectors are deformed, the natural front-to-back order of the sentence is broken; to prevent the deformed word block matrices from losing the order of the words in the sentence, a word bit sequence information layer is added to the network model. To further retain the position information of the original sentence, the model introduces a double-head structure, which better captures bidirectional semantic dependencies: the double-head structure forms a second branch by reversing the sentence and repeating the above four layers, and the two results are concatenated before the classification layer.
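The double-head idea above reduces to a simple pattern: run the shared pipeline on the sentence and on its reversal, then concatenate. In this sketch, `encode` is a toy placeholder for the embedding, bit sequence, deformation, and convolution pipeline, not the patent's actual layers:

```python
import numpy as np

def encode(word_vectors):
    # Toy stand-in for the shared 4-layer feature extractor.
    return np.concatenate(word_vectors)

sentence = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
C1 = encode(sentence)                    # forward-order features
C2 = encode(sentence[::-1])              # reverse-order features from the branch
features = np.concatenate([C1, C2])      # spliced input to the classification layer
assert features.tolist() == [1.0, 2.0, 3.0, 4.0, 3.0, 4.0, 1.0, 2.0]
```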
6. A classification layer
And the classification layer is used for performing classification operation on the spliced word vector characteristics to obtain a final text sentiment analysis result.
The classification layer is implemented with 2 fully connected layers: layer 1 is a fully connected layer with 8 neurons and the tanh activation function, and layer 2 is a fully connected layer with 2 neurons and the sigmoid activation function. The formulas can be expressed as:
O = tanh(Wh · c + bh)
result = sigmoid(Ws · O + bs)
where O and result are the outputs of the first and second fully connected layers, Wh and Ws are the weight matrices of the first and second fully connected layers, and bh and bs are the bias vectors of the first and second fully connected layers.
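A NumPy sketch of this two-layer head, matching O = tanh(Wh · c + bh) and result = sigmoid(Ws · O + bs); the feature dimension and random weights are placeholders, not trained values:

```python
import numpy as np

# Two-layer classification head: 8-neuron tanh layer, then 2-neuron sigmoid layer.
rng = np.random.default_rng(2)

feat_dim = 16
c = rng.standard_normal(feat_dim)                 # spliced word vector features
Wh, bh = rng.standard_normal((8, feat_dim)), np.zeros(8)
Ws, bs = rng.standard_normal((2, 8)), np.zeros(2)

O = np.tanh(Wh @ c + bh)                          # first fully connected layer
result = 1.0 / (1.0 + np.exp(-(Ws @ O + bs)))     # sigmoid scores for 2 classes
assert result.shape == (2,)
```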
When training the network model, an evaluation sentence data set is first prepared and divided into a training set and a test set. The training set is imported into the network model for training: the sentences of the training set are first segmented with a word segmentation toolkit; each segmented word passes through the word embedding layer to generate a word vector, and the bit sequence information layer then adds bit sequence information to each word vector. The word vectors with added bit sequence information are deformed in the word vector deformation layer, where one-dimensional word vectors whose length equals the word vector dimension are converted into h × h word block matrices, and the word block matrices of all word vectors of a sentence form a sentence matrix. Finally, the convolution layer convolves the sentence matrix with a 3 × 3 convolution kernel to obtain word vector features. To further retain the position information of the original sentence, the double-head structure processes the reversed sentence through a word embedding layer, bit sequence information layer, word vector deformation layer and convolution layer to obtain a second set of word vector features; after the two sets of word vector features are spliced, a fully connected layer performs text emotion classification. After the network model has been trained on the training set, it is tested with the test set to obtain the final network model.
Experiment 1
To verify that the parameter count of the 3 × 3 convolution is far smaller than the parameter counts of previous models' convolutions, the values of the trainable parameters in the convolution layers of different models are listed, taking the IMDB data set as an example.
Table 1 results of experiment 1
[Table 1 was provided as an image; its data is not reproduced here.]
Table 1 compares the trainable parameters of the convolution layer and bit sequence information layer of the Bi-2DCNN model with those of the convolution layers of other models. As Table 1 shows, the Bi-2DCNN model with 3 × 3 convolution uses fewer parameters than conventional whole-word-vector convolution, and the following experiments show that its effect is also better.
Experiment 2
To test the influence of the convolution kernel size in the convolution layer of the Bi-2DCNN model on its text emotion analysis effect, ablation experiments with different kernel sizes were carried out, with 4 experimental groups using different kernel sizes. IMDB reviews were used as the data set, and the number of convolution kernels was set to 32.
Table 2 results of experiment 2
[Table 2 was provided as an image; its data is not reproduced here.]
Table 2 shows the effect of different convolution kernel sizes in the convolution layer of the Bi-2DCNN model on the IMDB data set. As Table 2 shows, the Bi-2DCNN model with a kernel size of 3 performs better on this data set than the other configurations: after the data passes through the word embedding layer, abstract features of the word vectors have already been extracted, and feature extraction with a small kernel and small receptive field [18] works better; moreover, a small-kernel model has fewer parameters, so training takes less time. This scheme therefore uses the Bi-2DCNN model with a kernel size of 3 and obtains a good text emotion analysis effect.
Experiment 3
To test the influence of the added bit sequence information module in Bi-2DCNN on the model's text emotion analysis effect, ablation experiments with and without the bit sequence information layer were carried out, with 3 experimental groups: no bit sequence information, bit sequence information applied by matrix addition, and bit sequence information applied by matrix multiplication. IMDB reviews were used as the data set.
Table 3 results of experiment 3
[Table 3 was provided as an image; its data is not reproduced here.]
Table 3 shows the effect of the Bi-2DCNN model on the IMDB data set without and with the bit sequence information module. As Table 3 shows, bit sequence information applied by matrix multiplication performs best on this data set: after deformation, each row of the feature matrix output by the word embedding layer no longer represents a single word, so a bit sequence information module is added to retain the position information of the word vectors. This lets the network know that each word matrix block is a whole word, so the model learns context information better and achieves a good text emotion analysis effect. This scheme therefore uses the Bi-2DCNN model with the matrix-multiplication bit sequence information module.
Experiment 4
To test whether the double-head structure affects the Bi-2DCNN model's text emotion analysis results, an ablation experiment with and without the double-head structure was carried out, with 2 experimental groups. IMDB reviews were used as the data set, with the rest of the model structure kept identical.
Table 4 results of experiment 4
[Table 4 was provided as an image; its data is not reproduced here.]
Table 4 shows the effect of the Bi-2DCNN model on the IMDB data set without and with the double-head structure. As Table 4 shows, the model with the double-head structure performs better, because the double-head structure captures bidirectional semantic dependencies better and the generated feature vectors are closer to the true emotion polarity. This scheme therefore uses the Bi-2DCNN model with the double-head structure to achieve a better effect.
Experiment 5
To test the influence of the word vector deformation layer in Bi-2DCNN on the model's text emotion analysis effect, ablation experiments with and without the word vector deformation layer were carried out, with 2 experimental groups, using IMDB reviews as the data set.
Table 5 results of experiment 5
[Table 5 was provided as an image; its data is not reproduced here.]
Table 5 shows the effect of introducing the word vector deformation layer into the Bi-2DCNN model on the IMDB data set. As Table 5 shows, the Bi-2DCNN model with the word vector deformation layer performs better than the model without it: without deformation, two thirds of each 3 × 3 window covers information from other word vectors, whereas after deformation the 3 × 3 convolution computes, as far as possible, information belonging to a single word vector. This scheme therefore uses the Bi-2DCNN model with the word vector deformation layer and obtains a good text emotion analysis effect.
Experiment 6
Comparison experiments between the Bi-2DCNN model and classical neural network models were performed on data sets from 3 domains to verify the effectiveness of the model.
TABLE 6 multiple model comparison test results
[Table 6 was provided as an image; its data is not reproduced here.]
Table 6 gives the experimental results of the different models on the takeaway review, movie review, and airline review data sets. The results show that all models perform well on the airline data set, because it is small while its dictionary is relatively large, so the word vector representations obtained through the word embedding layer are expressive and the text emotion analysis effect on the airline data set is better than on the other two data sets. The TextCNN model performs poorly compared with the others because it was proposed early and simply splices the convolution results of whole word vectors. Bi-LSTM and ATT-LSTM work better than LSTM: Bi-LSTM adopts a bidirectional structure that considers sequence features in both directions, and ATT-LSTM introduces an attention mechanism that ties the related context phrases of the feature vectors more closely together, so the emotion polarity of the text is identified more accurately. GRU's gate structure differs from LSTM's, so its text emotion analysis effect is better than LSTM's, although its computation time is longer. RCNN combines RNN and CNN, but evidently this combination does not exploit the advantages of RNN and CNN to the fullest. The Bi-2DCNN model works better than TextCNN because feature extraction with bit-ordered 3 × 3 convolution works better than convolving whole word vectors. Analysis of the experimental results shows that the Bi-2DCNN model outperforms the other models on the three data sets from different domains, i.e. it has a better text emotion analysis effect than mainstream models.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (8)

1. A text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution is characterized by comprising the following steps:
obtaining an evaluation statement, inputting the evaluation statement into a trained bidirectional standard convolution network model with bit sequence information, and outputting a text emotion analysis result through the model; the bidirectional standard convolution network model with the bit sequence information comprises a word embedding layer, a bit sequence information layer, a word vector deformation layer, a convolution layer, a double-head structure and a classification layer;
the word embedding layer is used for converting words in the evaluation sentences into word vectors that a computer can process, so that words with similar meanings are mapped to nearby positions in the vector space;
the bit sequence information layer is used for receiving the word vectors output by the word embedding layer and adding bit sequence information to them, yielding word vectors with bit sequence information; the bit sequence information borrows the idea of applying an attention mechanism over a channel: each word vector in the evaluation statement is convolved to generate a trainable bit sequence vector, which is multiplied with the original word vector to strengthen the context relation;
the word vector deformation layer is used for receiving word vectors with bit sequence information, deforming the word vectors into a word block matrix and finally splicing the word block matrix into a sentence matrix;
the convolution layer is used for receiving the sentence matrix output by the word vector deformation layer and performing a 3×3 convolution operation on the sentence matrix to obtain a first word vector characteristic;
the double-head structure comprises a branch parallel to the word embedding layer, the bit sequence information layer, the word vector deformation layer and the convolution layer; after the word sequence of the evaluation statement is reversed, it is input into the branch and processed through the word embedding layer, bit sequence information layer, word vector deformation layer and convolution layer of the branch to obtain a second word vector characteristic output by the convolution layer of the branch; the first word vector characteristic and the second word vector characteristic are spliced to obtain the spliced word vector characteristic;
and the classification layer is used for performing classification operation on the spliced word vector characteristics to obtain a final text emotion analysis result.
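The double-head flow recited in claim 1 — embed, add bit sequence information, deform into a sentence matrix, convolve with a 3×3 kernel for both the forward and the reversed word sequence, then splice the two characteristics — can be sketched end-to-end as follows. This is a minimal NumPy illustration: the position scalar, the vector dimensions, and the kernel values stand in for trained parameters and are not the claimed implementation.

```python
import numpy as np

def conv3x3(m, k):
    """Valid 3×3 convolution (single channel, stride 1)."""
    H, W = m.shape
    return np.array([[np.sum(m[i:i+3, j:j+3] * k)
                      for j in range(W - 2)] for i in range(H - 2)])

def branch(word_vecs, h, kernel):
    """One head: position-scale each word vector, reshape it to an h×h block,
    tile the blocks row-first (zero-padded to a square), then convolve."""
    n = len(word_vecs)
    scaled = [np.tanh(v.mean()) * v for v in word_vecs]   # toy position scalar
    g = int(np.ceil(np.sqrt(n)))                          # grid side, in blocks
    blocks = [v.reshape(h, h) for v in scaled] + [np.zeros((h, h))] * (g*g - n)
    M = np.vstack([np.hstack(blocks[r*g:(r+1)*g]) for r in range(g)])
    return conv3x3(M, kernel).ravel()

rng = np.random.default_rng(0)
words = [rng.normal(size=16) for _ in range(4)]   # 4 words, h = 4
k = rng.normal(size=(3, 3))
feat = np.concatenate([branch(words, 4, k),        # forward head
                       branch(words[::-1], 4, k)]) # reversed head
assert feat.size == 2 * 6 * 6                      # two 6×6 conv maps, flattened
```

The spliced vector `feat` is what the classification layer below would receive.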
2. The text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution as claimed in claim 1, wherein the word embedding layer performs word vector processing using a word2vec model, segmenting the evaluation sentence into words and then representing each word in space-vector form;
the word vector generation model uses three fully-connected layers to obtain the word vector representation of the evaluation statement: S = [x_1, x_2, ..., x_n], x_i ∈ R^d (i = 1, 2, ..., n), where R is the set of real numbers and d denotes the dimension of the word vector.
3. The text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution as claimed in claim 1, wherein, to obtain the bit sequence information of the word vectors, an n×1 convolution kernel slides over S with a stride of 1 to generate a learnable position value for each word vector of the evaluation sentence, and the learnable position value is multiplied with the word matrix block to retain the position information; the bit sequence information c of a word vector is calculated as:
c = f(W_1·x_i + b_1)
where x_i is a word vector, W_1 is the weight matrix of the convolution kernel, b_1 is the bias, and f is the activation function of the convolution kernel; the word vector with added bit sequence information is expressed as:
x′_i = c × x_i
The finally obtained evaluation statement containing the bit sequence information is expressed as: S_p = [x′_1, x′_2, ..., x′_n].
4. The text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution as claimed in claim 1, wherein any word vector carrying bit sequence information, x′_i = [k_1, k_2, ..., k_{h×h}], where k denotes an element of the word vector and the word vector has h×h elements, is deformed into a word block matrix as follows:

X_i = | k_1          k_2          ...  k_h     |
      | k_{h+1}      k_{h+2}      ...  k_{2h}  |
      | ...          ...          ...  ...     |
      | k_{(h-1)h+1} k_{(h-1)h+2} ...  k_{h×h} |

Then the word block matrices of all word vectors of the evaluation statement are spliced in row-first order; if there are not enough word block matrices to form a square matrix, it is padded with all-zero h×h blocks. The spliced sentence matrix is:

S_m = | X_1     X_2     ...  X_g    |
      | X_{g+1} X_{g+2} ...  X_{2g} |
      | ...     ...     ...  ...    |
      | ...     O       ...  O      |

where g is the number of word block matrices per row and O is an all-zero h×h block.
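The deformation and splicing steps of claim 4 can be sketched as follows; h = 4 and the element values are illustrative.

```python
import math
import numpy as np

def to_sentence_matrix(Sp, h):
    """Reshape each word vector (length h*h) into an h×h block, then tile
    the blocks row-first into a square grid, padding with all-zero blocks."""
    blocks = [x.reshape(h, h) for x in Sp]
    g = math.ceil(math.sqrt(len(blocks)))        # grid side, in blocks
    blocks += [np.zeros((h, h))] * (g * g - len(blocks))
    rows = [np.hstack(blocks[r*g:(r+1)*g]) for r in range(g)]
    return np.vstack(rows)

Sp = np.arange(3 * 16, dtype=float).reshape(3, 16)   # 3 words, h = 4
M = to_sentence_matrix(Sp, h=4)
assert M.shape == (8, 8)          # 2×2 grid of 4×4 blocks, one zero-padded block
assert np.all(M[4:, 4:] == 0)     # bottom-right block is the zero padding
```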
5. The text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution as claimed in claim 1, wherein the classification layer is implemented with two fully-connected layers: layer 1 is a fully-connected layer with 8 neurons and a tanh activation function, and layer 2 is a fully-connected layer with 2 neurons and a sigmoid activation function.
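A minimal forward-pass sketch of the classification layer of claim 5, with randomly initialized weights standing in for trained parameters:

```python
import numpy as np

def classify(feat, W1, b1, W2, b2):
    """Two-layer classifier per claim 5: 8 tanh units, then 2 sigmoid units."""
    h = np.tanh(feat @ W1 + b1)                   # layer 1: 8 neurons, tanh
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # layer 2: 2 neurons, sigmoid

rng = np.random.default_rng(2)
f_dim = 32                                        # spliced-feature size (illustrative)
W1, b1 = rng.normal(size=(f_dim, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)
probs = classify(rng.normal(size=f_dim), W1, b1, W2, b2)
assert probs.shape == (2,) and np.all((probs > 0) & (probs < 1))
```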
6. The text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution as claimed in claim 1, wherein, to train the network model, an evaluation-sentence data set is first prepared and divided into a training set and a test set; the training set is imported into the network model for training: the sentences of the training set are first segmented with a word-segmentation toolkit, each segmented word is turned into a word vector by the word embedding layer, and the bit sequence information layer then adds bit sequence information to each word vector; the word vectors with added bit sequence information undergo deformation processing in the word vector deformation layer, converting each one-dimensional word vector whose length equals the word-vector dimension into an h×h word block matrix, and the word block matrices of all word vectors in the sentence form a sentence matrix; finally, a convolution layer with a 3×3 convolution kernel convolves the sentence matrix to obtain the word vector characteristics; to further retain the position information of the original sentence, the double-head structure passes the reversed sentence through the word embedding layer, bit sequence information layer, word vector deformation layer and convolution layer to obtain a second set of word vector characteristics; the two sets of word vector characteristics are spliced, and a fully-connected layer performs text emotion classification; after the network model is trained on the training set, it is tested with the test set to obtain the final network model.
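The first training step of claim 6 — dividing the evaluation-sentence data set into a training set and a test set — can be sketched as follows; the 80/20 split ratio is an assumption, as the claim does not specify one.

```python
import numpy as np

def split_dataset(samples, labels, test_ratio=0.2, seed=0):
    """Shuffle and split an evaluation-sentence data set into a training
    set and a test set, per the training procedure of claim 6."""
    idx = np.random.default_rng(seed).permutation(len(samples))
    cut = int(len(samples) * (1 - test_ratio))
    tr, te = idx[:cut], idx[cut:]
    return (samples[tr], labels[tr]), (samples[te], labels[te])

X = np.arange(10, dtype=float).reshape(10, 1)   # 10 toy samples
y = np.array([0, 1] * 5)                        # toy emotion-polarity labels
(train_X, train_y), (test_X, test_y) = split_dataset(X, y)
assert len(train_X) == 8 and len(test_X) == 2
```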
7. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution according to any one of claims 1-6 when executing the computer program.
8. A computer-readable storage medium in which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution according to any one of claims 1-6.
CN202210434984.9A 2022-04-24 2022-04-24 Text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution Pending CN114742070A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210434984.9A CN114742070A (en) 2022-04-24 2022-04-24 Text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210434984.9A CN114742070A (en) 2022-04-24 2022-04-24 Text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution

Publications (1)

Publication Number Publication Date
CN114742070A true CN114742070A (en) 2022-07-12

Family

ID=82283621

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210434984.9A Pending CN114742070A (en) 2022-04-24 2022-04-24 Text emotion analysis method based on word vector deformation and bidirectional bit sequence convolution

Country Status (1)

Country Link
CN (1) CN114742070A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination