CN115510230A - Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism - Google Patents


Info

Publication number
CN115510230A
Authority
CN
China
Prior art keywords
fusion
matrix
feature
vector
mongolian
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211145422.9A
Other languages
Chinese (zh)
Inventor
苏依拉
赵梦莹
仁庆道尔吉
吉亚图
乌尼尔
路敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University of Technology
Original Assignee
Inner Mongolia University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia University of Technology filed Critical Inner Mongolia University of Technology
Priority to CN202211145422.9A
Publication of CN115510230A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

A Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism comprises: preprocessing a Mongolian emotion corpus, carrying out multi-dimensional feature representation on the preprocessed corpus, and then carrying out multi-dimensional feature attention fusion to obtain a fusion feature matrix F; extracting a topic lexicon from F and obtaining a topic feature vector S through CNN model training; inputting F and S into a TBGRU model to obtain text semantic information R; carrying out attention fusion on the common semantic features of R and S; and obtaining text emotion classification information from the fusion result by using a comparative reinforcement learning mechanism. The method enables accurate emotion analysis of Mongolian texts.

Description

Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism
Technical Field
The invention belongs to the technical field of artificial intelligence, relates to emotion analysis in natural language processing, and particularly relates to a Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism.
Background
Emotion analysis is a basic task in natural language processing and computational linguistics that aims to judge the emotional tendency expressed by a text. Emotion analysis technology mainly works at three granularities: chapter level, sentence level, and aspect level. Methods that take a whole article as the research object are relatively coarse and can only judge whether the overall emotional tendency of the article is positive or negative. Methods that take sentences as the unit can identify the overall emotion a sentence expresses, but cannot judge the emotion polarity of the target words it contains. Aspect-level emotion analysis is a fine-grained emotion analysis that aims to determine the emotional tendency toward a specific entity or attribute in a sentence. In past research, a whole paragraph or sentence contains much textual information but its emotion polarity is treated as relatively uniform: only a positive or negative tendency can be obtained, and the reviewer's emotional attitude toward a particular entity cannot be accurately analyzed. Judging the emotion polarity of a text depends not only on the textual information in the sentence but is also closely related to how specific aspects are expressed in the text; different words in a sentence therefore influence the judgment of emotion polarity differently. For most text content, giving only one general emotional tendency is of little use; more detailed analysis results are needed so that review information can be understood comprehensively and better choices can be made.
Given the unique advantages of deep learning in natural language processing, many researchers have proposed aspect-level emotion analysis models based on recurrent neural networks (RNN); however, a single recurrent neural network cannot capture the relevance between aspect words and key information in sentences, so many researchers have introduced attention mechanisms to address this. Wang et al. concatenate semantic vectors with aspect vectors and combine the hidden layer of a long short-term memory network (LSTM) with an attention mechanism to extract emotional features. Tang et al. construct text word vectors as an external memory for attention learning and obtain aspect emotional features through iterative computation of multi-layer attention. Zhang et al. use convolutional and recurrent neural network models for short-text emotion classification, generating coarse-grained feature representations with convolutional neural networks and learning long-distance dependency information between words with long short-term memory networks. Although these studies combine the various advantages of RNNs and CNNs, the text features input to the training models are simple, so the extracted text semantic information is insufficient, and classification accuracy for implicit emotions in particular is low.
Mongolian is the language of the Mongolian ethnic group in the Inner Mongolia Autonomous Region of China. As the main vehicle of communication among Mongolian people, Mongolian plays an important role in the political, economic, cultural, and social development of the region. However, emotion analysis research on Mongolian started late, Mongolian morphology varies in more complicated ways than languages such as English and Chinese, and Mongolian corpora are relatively scarce. Emotion analysis research on Mongolian is therefore necessary.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide a Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism. Attention-based multi-feature fusion is carried out on word vectors, part-of-speech features, position features, and syntactic dependency features, and a topic lexicon is extracted; the fused features are then input into an improved TBGRU model to obtain semantic information, attention is combined with the topic lexicon to fuse semantic feature information, and finally a comparative reinforcement mechanism is used to obtain aspect-level text emotion classification information.
In order to achieve the purpose, the invention adopts the technical scheme that:
a Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism comprises the following steps:
step 1, preprocessing Mongolian emotion corpus, carrying out multi-dimensional feature representation on the preprocessed Mongolian emotion corpus, and then carrying out multi-dimensional feature attention fusion to obtain a fusion feature matrix F;
step 2, extracting a topic lexicon from the fusion feature matrix F, and obtaining a topic feature vector through CNN model training;
step 3, inputting the fusion feature matrix F and the theme feature vector into a TBGRU model to obtain text semantic information;
step 4, performing attention fusion on the output result of the TBGRU model and the common semantic features of the topic feature vector;
and step 5, acquiring text emotion classification information by using a comparative reinforcement learning mechanism according to the fusion result.
Compared with the prior art, the invention has the following beneficial effects:
First, the invention remedies the incompleteness of existing text feature extraction by providing a multi-dimensional feature representation method that extracts text features more fully. Second, the invention improves and combines the RNN and CNN models and extracts and embeds topic features, so text semantic information is extracted more fully and implicit topics are classified more accurately. Third, the invention introduces a comparative reinforcement learning mechanism for classification; compared with previous classification methods, classifying with a comparative reinforcement learning mechanism can replace a large number of complex calculations.
Drawings
FIG. 1 is a schematic overall flow diagram of the present invention.
FIG. 2 is a multi-dimensional feature fusion framework diagram.
FIG. 3 is a diagram of a TBGRU model.
FIG. 4 is a diagram of the comparative reinforcement learning mechanism model.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
The invention relates to a Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism, which consists of three parts. The first part is multi-dimensional feature representation: the word vectors are given multi-dimensional feature representations, and multi-dimensional feature attention fusion is then carried out to obtain a multi-dimensional fusion feature matrix. The second part is model training: a topic feature vector is extracted from the fusion feature matrix, and the topic feature vector and the fusion feature matrix are then input into the TBGRU model for training to obtain text semantic information. The third part is feature-fusion emotion classification: the text semantic information acquired from the TBGRU model in the model training stage and the topic feature vector are fused with an attention semantic feature fusion method, and the aspect-level emotion classification result is obtained from the fusion result.
As shown in fig. 1, the method of the present invention specifically includes the following steps:
Step 1, preprocess the Mongolian emotion corpus, carry out multi-dimensional feature representation on the preprocessed corpus, and then carry out multi-dimensional feature attention fusion to obtain the fusion feature matrix F.
In the preprocessing of the invention, the Mongolian emotion corpus first undergoes data cleaning and is then segmented, with each Mongolian word separated as a minimum unit. For example, the Mongolian sentence [Mongolian script, rendered as an image in the original] is segmented into its individual words [Mongolian script].
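A minimal sketch of this cleaning and segmentation step follows. The patent does not enumerate its cleaning rules, so the stripped character classes and the plain whitespace split standing in for per-word segmentation are assumptions:

```python
import re

def preprocess(line: str) -> list[str]:
    """Step 1 preprocessing: clean a line of Mongolian corpus text and
    split it into per-word minimum units (a sketch, not the patent's
    exact rules)."""
    # Drop URLs, digits and ASCII punctuation that carry no sentiment
    # signal (assumed cleaning rules; the patent does not list them).
    line = re.sub(r"https?://\S+", " ", line)
    line = re.sub(r"[0-9~!@#$%^&*()_+=\[\]{};:'\",.<>/?\\|]+", " ", line)
    # Traditional Mongolian script separates words with spaces, so
    # whitespace splitting yields one minimum unit per Mongolian word.
    return line.split()
```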
Multi-dimensional feature representation is then carried out on the text: the word vectors are given multi-dimensional feature representations, and multi-dimensional feature attention fusion is performed to obtain multi-dimensionally fused word vectors. The specific steps are as follows:
and 1.1, multi-dimensional feature representation.
The part of speech of emotion words in a text corpus, the positional relationships between words, and syntactic dependencies are all important. For example, in the sentence [Mongolian script] (the product has poor sound quality but good battery life), although the polarity of [Mongolian script] (sound quality) in the first half of the sentence is negative, the polarity of [Mongolian script] (battery life) in the second half is positive. Therefore, as shown in FIG. 2, by adding the part of speech of the emotion words in the text corpus, the positional relationships between words, and the syntactic dependency features into the word vectors, deeper information implied by the text semantics can be mined at a multi-dimensional level.
Let the word vector of the t-th word in the sentence be e_t, and let the part-of-speech, position, and syntactic dependency feature vectors corresponding to the t-th word be s_t, t_t, and q_t, respectively. The vectors of the words in the sentence are spliced together, and each feature matrix is expressed as follows:

P_c = e_1 ⊕ e_2 ⊕ ... ⊕ e_a
P_d = s_1 ⊕ s_2 ⊕ ... ⊕ s_a
P_b = t_1 ⊕ t_2 ⊕ ... ⊕ t_a
P_k = q_1 ⊕ q_2 ⊕ ... ⊕ q_a

where a denotes the length of the sentence, 1 ≤ t ≤ a; c, d, b and k denote the dimensionalities of the word-vector matrix, the part-of-speech feature matrix, the position feature matrix, and the syntactic dependency feature matrix, respectively; P_c, P_d, P_b and P_k denote the spliced word, part-of-speech, position, and syntactic dependency feature matrices of the sentence; and ⊕ denotes the vector splicing operation. The output matrix after multi-dimensional feature representation is y = P_c + P_d + P_b + P_k.
Step 1.2, multi-dimensional feature attention fusion.
After the multi-dimensional feature representation of a sentence is obtained, simple feature splicing alone, e.g. P_{c+d} = P_c + P_d, can hardly account for the differences between the features, and the spliced matrix increases the vector dimensionality, which puts pressure on model training. As shown in FIG. 2, the invention therefore uses an attention mechanism to focus on the feature information of the target object; it can find the words with larger contributions in a text sentence and better capture semantically relevant information from the text context. The multi-dimensional feature attention fusion is calculated as follows:
M(y_i) = tanh(W y_i + b)
β_i = exp(M(y_i)) / Σ_{j=1}^{n} exp(M(y_j))
f_i = β_i y_i

where tanh denotes the activation function; y_i denotes the i-th vector in matrix y; M(y_i) denotes the weight of the feature corresponding to y_i; W denotes a weight matrix; b denotes a bias matrix; β_i denotes the output of M(y_i) through SoftMax; f_i denotes the fused feature vector of the i-th word; and n denotes the number of vectors in matrix y. Fusing all features through attention yields the fused feature matrix F = [f_1, f_2, ..., f_i, ..., f_n].
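As a concrete illustration of step 1.2, the sketch below implements this attention fusion in NumPy. It collapses W to a weight vector so that M(y_i) is one scalar score per word; that shape choice, like the toy dimensions in the usage example, is an assumption, since the patent leaves the dimensions open:

```python
import numpy as np

def attention_fuse(y: np.ndarray, w: np.ndarray, b: float) -> np.ndarray:
    """Step 1.2 sketch: score each row of y, normalise with SoftMax,
    and scale the rows by their weights.

    y : (n, dim) matrix of per-word multi-dimensional features.
    w, b : learnable weight vector and bias (scalar score per word assumed).
    Returns F whose rows are f_i = beta_i * y_i.
    """
    scores = np.tanh(y @ w + b)                   # M(y_i), one scalar per word
    beta = np.exp(scores) / np.exp(scores).sum()  # beta_i via SoftMax over n words
    return beta[:, None] * y                      # fused feature matrix F

# Toy usage: 4 words with 8-dimensional fused features.
rng = np.random.default_rng(0)
y = rng.normal(size=(4, 8))
F = attention_fuse(y, rng.normal(size=8), 0.1)
```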
Step 2, extract the topic lexicon from the fusion feature matrix F: for example, [Mongolian script] (sound quality) and [Mongolian script] (battery life) are topic words. A topic feature vector is then obtained through CNN model training.
According to the invention, topic words are embedded into the model on the basis of the fusion feature matrix F. Specifically, SS-LDA is applied to the fusion feature matrix F to extract the aspect topics appearing in a sentence, forming a topic lexicon s = [s_1, s_2, ..., s_i, ..., s_m], where s_i denotes the i-th topic word and m denotes the number of topic words. The example sentence [Mongolian script] (this product has poor sound quality but good battery life) has three topic words: [Mongolian script]. In the embedding process, an entry of the topic lexicon s may consist of one or more words. The topic lexicon s is input into the CNN model, and the topic features u are extracted through convolution and pooling operations, as follows:

u = f_relu(s * W_u + b_u)

where f_relu denotes the activation function, W_u is a convolution kernel of size c × m, and b_u is the bias value.
The topic features u are sampled by a max pooling method to obtain the topic feature vector S = [S_1, S_2, ..., S_i, ..., S_M], where S_i denotes the i-th topic feature obtained after sampling and M denotes the number of topic features. In the example sentence (this product has poor sound quality but good battery life), the topic features finally retained are [Mongolian script] and [Mongolian script].
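A minimal sketch of this convolution-and-pooling step follows. The patent gives the kernel size as c × m but not the number of filters or the window width, so the filter bank W_u of M kernels spanning k topic words each is an assumption:

```python
import numpy as np

def topic_features(s_embed: np.ndarray, W_u: np.ndarray,
                   b_u: np.ndarray) -> np.ndarray:
    """Step 2 sketch: convolve the topic-word embeddings and max-pool
    per kernel.

    s_embed : (m, c) embeddings of the m topic words from SS-LDA.
    W_u     : (M, k, c) bank of M kernels, each spanning k topic words
              (assumed shape; the patent only states a c x m kernel).
    b_u     : (M,) bias values.
    Returns S = [S_1, ..., S_M], one pooled response per kernel
    (assumes m >= k so every kernel has at least one window).
    """
    m, _ = s_embed.shape
    M, k, _ = W_u.shape
    S = np.empty(M)
    for f in range(M):
        # u_j = relu(<window_j, W_u[f]> + b_u[f]) at each window position j
        u = [max((s_embed[j:j + k] * W_u[f]).sum() + b_u[f], 0.0)
             for j in range(m - k + 1)]
        S[f] = max(u)  # max pooling keeps the strongest response
    return S
```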
and 3, inputting the fusion feature matrix F and the topic feature vector S into a TBGRU model to obtain text semantic information.
The TBGRU model embeds the topic feature vector on the basis of the GRU model. As shown in FIG. 3, the fusion feature matrix F is input into the TBGRU model while the topic feature vector is embedded into the TBGRU model for training; attention is added during training for weight assignment, and the text semantic information representation is obtained.
The TBGRU model workflow is as follows:
Input: the multi-dimensional fusion feature matrix F and the topic feature vector S.
Output: the text semantic representation R.
(1) Input F into the TBGRU model and encode it with a bidirectional GRU. The hidden state h_i of each input word is represented by the concatenation of a forward hidden state and a backward hidden state, capturing context information of the entire sentence centered on h_i.
(2) Embed S into the TBGRU model. Compute the projection v_i of h_i with an MLP, then compute the dot product of v_i and S_i to generate the embedding weight β_i of each topic feature vector with respect to the position information of the fused feature vector. The final output topic embedding p_i is computed as follows:

v_i = tanh(W_a h_i)
β_i = softmax(v_i · S_i)
p_i = β_i ⊗ S_i

where W_a denotes a weight matrix and ⊗ denotes the product of a scalar and a vector.
(3) Embed p_i into the bidirectional GRU, which changes the original bidirectional GRU formulas into the following:

q_i = σ(W_q f_i + U_q h_{i-1} + V_q p_{i-1})
z_i = σ(W_z f_i + U_z h_{i-1} + V_z p_{i-1})
h̃_i = tanh(W_h f_i + U_h (q_i ⊙ h_{i-1}) + V_h p_{i-1})
h_i = (1 - z_i) ⊙ h_{i-1} + z_i ⊙ h̃_i

where σ(·) denotes the sigmoid function; W_q, W_z, W_h, U_q, U_z, U_h, V_q, V_z and V_h are all weight matrices of different dimensionalities, learned as network parameters through the model; ⊙ denotes the product between vectors; q_i denotes the reset gate; z_i denotes the update gate; h_i denotes the hidden state interpolated between the previous hidden state h_{i-1} and the current candidate state h̃_i; p_i denotes the current topic embedding state; and p_{i-1} denotes the topic embedding state preceding it.
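The two computations of steps (2) and (3), topic embedding and the modified GRU update, can be sketched together as follows. The uniform square weight shapes and the reading of p_i as a β-weighted sum over topic vectors are assumptions where the patent is terse:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def topic_embedding(h_i: np.ndarray, S: np.ndarray, W_a: np.ndarray) -> np.ndarray:
    """Step (2) sketch: project h_i and weight each topic feature row of S
    by its dot product with the projection."""
    v = np.tanh(W_a @ h_i)                        # v_i = tanh(W_a h_i)
    scores = S @ v                                # v_i . S_i for every topic row
    beta = np.exp(scores) / np.exp(scores).sum()  # beta_i via softmax
    return beta @ S                               # p_i: beta-weighted topic mix (assumed sum)

def tbgru_step(f_i, h_prev, p_prev, P):
    """Step (3) sketch: one TBGRU update, a GRU cell whose gates also see
    the previous topic embedding p_{i-1} through the V_* matrices."""
    q = sigmoid(P["W_q"] @ f_i + P["U_q"] @ h_prev + P["V_q"] @ p_prev)  # reset gate
    z = sigmoid(P["W_z"] @ f_i + P["U_z"] @ h_prev + P["V_z"] @ p_prev)  # update gate
    h_cand = np.tanh(P["W_h"] @ f_i + P["U_h"] @ (q * h_prev) + P["V_h"] @ p_prev)
    return (1.0 - z) * h_prev + z * h_cand        # interpolate previous and candidate state
```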
(4) Introduce an attention mechanism to assign weights to the word vectors, calculated as follows:

r_i = tanh(W_r h_i + b_r)
α_i = softmax(r_i V_r)
R = Σ_{i=1}^{N} α_i h_i

where α_i denotes the weight of h_i; R denotes the text semantic representation after weight assignment; W_r and b_r are parameters to be learned in training; V_r is a weight matrix; N is the number of vectors in the hidden state matrix h; and r_i denotes the vector output through the activation function.
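A short sketch of this attention pooling, under the same shape assumptions as above:

```python
import numpy as np

def semantic_representation(H: np.ndarray, W_r: np.ndarray,
                            b_r: np.ndarray, V_r: np.ndarray) -> np.ndarray:
    """Step (4) sketch: attention-weighted pooling of the N hidden
    states in H into the text semantic representation R."""
    r = np.tanh(H @ W_r.T + b_r)                   # r_i = tanh(W_r h_i + b_r)
    scores = r @ V_r                               # one scalar score per position
    alpha = np.exp(scores) / np.exp(scores).sum()  # alpha_i via softmax
    return alpha @ H                               # R = sum_i alpha_i h_i
```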
Step 4, perform attention fusion on the common semantic features of the TBGRU model's output and the topic feature vector.
First, cosine similarity is used to compute the similarity between S and R, i.e., between S_i and the i-th weighted text semantic representation R_i. Features with high similarity are given higher weights and features with low similarity are given lower weights; that is, the corresponding features are weighted from high to low according to their similarity. The obtained similarity weights are normalized and summed with a SoftMax function to obtain the attention weights γ_i. Finally, γ_i and R_i are combined in a weighting operation to obtain the semantic fusion feature vector μ_i of the i-th word, calculated as follows:

f(S_i, R_i) = cos(S_i, R_i)
γ_i = exp(f(S_i, R_i)) / Σ_{j=1}^{M} exp(f(S_j, R_j))
μ_i = γ_i R_i
μ = [μ_1, μ_2, ..., μ_M]

where f(S_i, R_i) denotes the similarity between S_i and R_i expressed by the cosine function, and μ denotes the semantic fusion feature matrix.
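Treating S and R as row-aligned (M, dim) matrices, which the patent implies but does not state, the fusion of step 4 reduces to a few lines:

```python
import numpy as np

def fuse_semantics(S: np.ndarray, R: np.ndarray) -> np.ndarray:
    """Step 4 sketch: cosine-similarity attention fusion of topic
    features S and weighted semantic features R; row i holds S_i, R_i."""
    cos = (S * R).sum(axis=1) / (
        np.linalg.norm(S, axis=1) * np.linalg.norm(R, axis=1))  # f(S_i, R_i)
    gamma = np.exp(cos) / np.exp(cos).sum()       # SoftMax-normalised weights
    return gamma[:, None] * R                     # row i is mu_i = gamma_i * R_i
```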
Step 5, obtain text emotion classification information using the comparative reinforcement learning mechanism according to the fusion result.
The comparative reinforcement learning mechanism scores each μ_i against the samples, as follows:

Positive and negative samples in equal numbers are randomly selected from the labeled training data, and sentence vectors corresponding to the samples are then generated, as shown in FIG. 4. A hidden-layer neural network serves as the similarity function to obtain the classification similarity score. The input of the neural network is the concatenation of the semantic fusion feature vector μ and all sample vectors; the output layer has size 1, and the hidden layer V has the length of the sentence vector. The similarity Score is calculated as follows:

Score = W_2(W_1 Concat(sample, μ) + s_1) + s_2
the neural network is expressed by two linear transformations and offsets, and the parameters are respectively W 1 、s 1 、W 2 、s 2 Where Sample represents a Sample vector. In the examples
Figure BDA0003855362140000083
Figure BDA0003855362140000084
(the product has poor sound quality but good battery life) and finally the product can be obtained
Figure BDA0003855362140000085
The polarity of (a) is negative,
Figure BDA0003855362140000086
is positive.
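A sketch of this scoring network closes the pipeline. Flattening all sample vectors into a single input, and the exact parameter shapes, are assumptions beyond what the patent specifies:

```python
import numpy as np

def similarity_score(mu: np.ndarray, samples: np.ndarray,
                     W1: np.ndarray, s1: np.ndarray,
                     W2: np.ndarray, s2: float) -> float:
    """Step 5 sketch: Score = W_2(W_1 Concat(sample, mu) + s_1) + s_2,
    with one hidden layer sized to the sentence-vector length."""
    x = np.concatenate([samples.reshape(-1), mu.reshape(-1)])  # Concat(sample, mu)
    hidden = W1 @ x + s1                                       # hidden layer V
    return float(W2 @ hidden + s2)                             # scalar similarity score

# Toy usage: score mu against two positive and two negative sample vectors.
rng = np.random.default_rng(0)
dim, V = 8, 8
samples = rng.normal(size=(4, dim))
mu = rng.normal(size=dim)
W1 = rng.normal(size=(V, samples.size + dim))
score = similarity_score(mu, samples, W1, rng.normal(size=V),
                         rng.normal(size=V), 0.0)
```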

Claims (8)

1. A Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism, characterized by comprising the following steps:
step 1, preprocessing Mongolian emotion corpus, carrying out multi-dimensional feature representation on the preprocessed Mongolian emotion corpus, and then carrying out multi-dimensional feature attention fusion to obtain a fusion feature matrix F;
step 2, extracting a topic lexicon from the fusion feature matrix F, and obtaining a topic feature vector through CNN model training;
step 3, inputting the fusion feature matrix F and the theme feature vector into a TBGRU model to obtain text semantic information;
step 4, performing attention fusion on the output result of the TBGRU model and the common semantic features of the topic feature vector;
and step 5, acquiring text emotion classification information by using a comparative reinforcement learning mechanism according to the fusion result.
2. The Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism according to claim 1, wherein the preprocessing in step 1 comprises: data cleaning and word segmentation.
3. The Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism according to claim 1, wherein the multi-dimensional feature representation adds the part of speech of the emotion words in the corpus, the positional relationships between words, and the syntactic dependency features into the word vectors, so as to mine deeper information implied by the text semantics at a multi-dimensional level;

let the word vector of the t-th word in the sentence be e_t, and let the part-of-speech, position, and syntactic dependency feature vectors corresponding to the t-th word be s_t, t_t, and q_t, respectively; the vectors of the words in the sentence are spliced together, and each feature matrix is expressed as follows:

P_c = e_1 ⊕ e_2 ⊕ ... ⊕ e_a
P_d = s_1 ⊕ s_2 ⊕ ... ⊕ s_a
P_b = t_1 ⊕ t_2 ⊕ ... ⊕ t_a
P_k = q_1 ⊕ q_2 ⊕ ... ⊕ q_a

where a denotes the length of the sentence, 1 ≤ t ≤ a; c, d, b and k denote the dimensionalities of the word-vector matrix, the part-of-speech feature matrix, the position feature matrix, and the syntactic dependency feature matrix, respectively; P_c, P_d, P_b and P_k denote the spliced word, part-of-speech, position, and syntactic dependency feature matrices of the sentence; ⊕ denotes the vector splicing operation; and the output matrix after multi-dimensional feature representation is y = P_c + P_d + P_b + P_k;

the multi-dimensional feature attention fusion is calculated as follows:

M(y_i) = tanh(W y_i + b)
β_i = exp(M(y_i)) / Σ_{j=1}^{n} exp(M(y_j))
f_i = β_i y_i

where tanh denotes the activation function; y_i denotes the i-th vector in matrix y; M(y_i) denotes the weight of the feature corresponding to y_i; W denotes a weight matrix; b denotes a bias matrix; β_i denotes the output of M(y_i) through SoftMax; f_i denotes the fused feature vector of the i-th word; n denotes the number of vectors in matrix y; and the fused feature matrix is denoted F = [f_1, f_2, ..., f_i, ..., f_n].
4. The Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism according to claim 3, wherein in step 2, SS-LDA is applied to the fusion feature matrix F to extract all the aspect topics appearing in a sentence, forming a topic lexicon s = [s_1, s_2, ..., s_i, ..., s_m], where s_i denotes the i-th topic word and m denotes the number of topic words; the topic lexicon s is input into the CNN model, and the topic features u are extracted through convolution and pooling operations, as follows:

u = f_relu(s * W_u + b_u)

where f_relu denotes the activation function, W_u is a convolution kernel of size c × m, and b_u is the bias value;

the topic features u are sampled by a max pooling method to obtain the topic feature vector S = [S_1, S_2, ..., S_i, ..., S_M], where S_i denotes the i-th topic feature obtained after sampling and M denotes the number of topic features.
5. The Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism according to claim 4, wherein in step 3, the TBGRU model embeds the topic feature vector on the basis of the GRU model; the fusion feature matrix F is input into the TBGRU model while the topic feature vector is embedded into the TBGRU model for model training, and attention is added during model training for weight assignment to obtain the text semantic information representation.
6. The Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism according to claim 5, wherein the flow of step 3 is as follows:

(1) Input the fusion feature matrix F into the TBGRU model and encode it with a bidirectional GRU; the hidden state h_i of each input word is represented by the concatenation of a forward hidden state and a backward hidden state, capturing context information of the entire sentence centered on h_i;

(2) Embed the topic feature vector S into the TBGRU model; compute the projection v_i of h_i with an MLP, then compute the dot product of v_i and S_i to generate the embedding weight β_i of the topic feature vector with respect to the position information of the fused feature vector; the final output topic embedding p_i is computed as follows:

v_i = tanh(W_a h_i)
β_i = softmax(v_i · S_i)
p_i = β_i ⊗ S_i

where W_a denotes a weight matrix and ⊗ denotes the product of a scalar and a vector;
(3) Embed p_i into the bidirectional GRU, which changes the original bidirectional GRU formulas into the following:

q_i = σ(W_q f_i + U_q h_{i-1} + V_q p_{i-1})
z_i = σ(W_z f_i + U_z h_{i-1} + V_z p_{i-1})
h̃_i = tanh(W_h f_i + U_h (q_i ⊙ h_{i-1}) + V_h p_{i-1})
h_i = (1 - z_i) ⊙ h_{i-1} + z_i ⊙ h̃_i

where σ(·) denotes the sigmoid function; W_q, W_z, W_h, U_q, U_z, U_h, V_q, V_z and V_h are all weight matrices of different dimensionalities, learned as network parameters through the model; ⊙ denotes the product between vectors; q_i denotes the reset gate; z_i denotes the update gate; h_i denotes the hidden state interpolated between the previous hidden state h_{i-1} and the current candidate state h̃_i; p_i denotes the current topic embedding state; and p_{i-1} denotes the topic embedding state preceding it;
(4) Introduce an attention mechanism to assign weights to the word vectors, calculated as follows:

r_i = tanh(W_r h_i + b_r)
α_i = softmax(r_i V_r)
R = Σ_{i=1}^{N} α_i h_i

where α_i denotes the weight of h_i; R denotes the text semantic representation after weight assignment; W_r and b_r are parameters to be learned in training; V_r is a weight matrix; N is the number of vectors in the hidden state matrix h; and r_i denotes the vector output through the activation function.
7. The Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism according to claim 5, wherein in step 4, cosine similarity is first used to compute the similarity between S_i and the i-th weighted text semantic representation R_i; the corresponding features are weighted from high to low according to their similarity; the obtained similarity weights are then normalized and summed to obtain the attention weights γ_i; and finally γ_i and R_i are combined in a weighting operation to obtain the semantic fusion feature vector μ_i of the i-th word, calculated as follows:

f(S_i, R_i) = cos(S_i, R_i)
γ_i = exp(f(S_i, R_i)) / Σ_{j=1}^{M} exp(f(S_j, R_j))
μ_i = γ_i R_i
μ = [μ_1, μ_2, ..., μ_M]

where f(S_i, R_i) denotes the similarity between S_i and R_i expressed by the cosine function, and μ denotes the semantic fusion feature matrix.
8. The Mongolian emotion analysis method based on multi-dimensional feature fusion and a comparative reinforcement learning mechanism according to claim 7, wherein in step 5 the comparative reinforcement learning mechanism scores each μ_i against the samples, as follows:

positive and negative samples in equal numbers are randomly selected from the labeled training data, and sentence vectors corresponding to the samples are then generated; a hidden-layer neural network serves as the similarity function to obtain the classification similarity score, the input of the neural network being the concatenation of μ and all sample vectors; the output layer has size 1, the hidden layer V has the length of the sentence vector, and the similarity Score is calculated as follows:

Score = W_2(W_1 Concat(sample, μ) + g_1) + g_2

where the neural network is expressed by two linear transformations with biases, whose parameters are W_1, g_1, W_2 and g_2, and sample denotes the overall vector of samples.
CN202211145422.9A 2022-09-20 2022-09-20 Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism Pending CN115510230A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211145422.9A CN115510230A (en) 2022-09-20 2022-09-20 Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism


Publications (1)

Publication Number Publication Date
CN115510230A true CN115510230A (en) 2022-12-23

Family

ID=84503522


Country Status (1)

Country Link
CN (1) CN115510230A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116187339A (en) * 2023-02-13 2023-05-30 首都师范大学 Automatic composition scoring method based on feature semantic fusion of double-tower model
CN116187339B (en) * 2023-02-13 2024-03-01 首都师范大学 Automatic composition scoring method based on feature semantic fusion of double-tower model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination