CN114297390A - Aspect category identification method and system under long-tail distribution scene - Google Patents

Aspect category identification method and system under long-tail distribution scene Download PDF

Info

Publication number
CN114297390A
CN114297390A CN202111681644.8A CN202111681644A CN114297390A CN 114297390 A CN114297390 A CN 114297390A CN 202111681644 A CN202111681644 A CN 202111681644A CN 114297390 A CN114297390 A CN 114297390A
Authority
CN
China
Prior art keywords
vector
category
embedding
module
equation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111681644.8A
Other languages
Chinese (zh)
Other versions
CN114297390B (en
Inventor
陆恒杨
方伟
聂玮
孙俊
吴小俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University filed Critical Jiangnan University
Priority to CN202111681644.8A priority Critical patent/CN114297390B/en
Publication of CN114297390A publication Critical patent/CN114297390A/en
Application granted granted Critical
Publication of CN114297390B publication Critical patent/CN114297390B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses an aspect category identification method and system under a long-tail distribution scene, and belongs to the technical field of natural language processing. The method is based on an aspect category identification system under a long-tail distribution scene, the system focuses on the long-tail distribution characteristics of data, firstly obtains fine-grained aspect feature vectors of sentences and provides additional context aspect-level semantic information; then, an attention mechanism fusing context aspect-level semantic information based on long-tail distribution is added, the capability of a model for capturing information most relevant to the aspect categories is enhanced, meanwhile, an improved distribution balance loss function is provided to solve the problems of label co-occurrence and negative category advantages in a long-tail multi-label text classification task, and the aspect category identification effect with the long-tail distribution characteristic is effectively improved.

Description

Aspect category identification method and system under long-tail distribution scene
Technical Field
The invention relates to an aspect category identification method and system under a long tail distribution scene, and belongs to the technical field of natural language processing.
Background
Aspect Category Detection (ACD), one of the important subtasks of Aspect level emotion analysis, aims to detect an Aspect Category contained in a sentence from a set of predefined Aspect categories. Aspect category identification is the fundamental task of whole aspect level sentiment analysis. The emotion analysis has wide application in various fields of life, for example, emotion analysis aiming at opinions of users on various topics expressed in social media, restaurant evaluation, online shopping and the like can help users to have better consumption experience, and can help merchants to know market demands.
However, in actual research, the aspect category distribution often presents the characteristics of unbalanced or even long-tail distribution, so that the model cannot sufficiently extract the features of the tail aspect category, which brings great challenges to the aspect category identification task.
Some existing work addresses this problem with classical machine learning models or deep learning models. For example, Ghadery, E et al (Ghadery, E., et al, MNCN: A Multilingual N gram-Based conditional Network for Aspect Category Detection in Online reviews.2019.33: p.6441-6448.) embed Multilingual words as input to the Network, extract features using a deep Convolutional neural Network, and then learn and identify different facet classes using different fully connected layers, respectively. Hu, M et al (Hu, M., et al, CAN: structured Attention Networks for Multi-Aspect Sentiment analysis.2018.) introduce sparse regularization and orthogonal regularization to compute Attention weights for multiple aspects. This allows the attention weight of multiple aspects to be focused on different parts, while the attention weight of each aspect is focused on only a few words. Movahedi, s. et al (Movahedi, s., et al, Aspect Category Detection view-Attention network.2019.) propose a Topic Attention network model that can detect Aspect categories of a given sentence by paying Attention to different parts of the sentence. Li, Y. et al (Li, Y., et al. Multi-instant Multi-Label Learning Networks for the Aspect-Category analysis. in the processes of the 2020 Conference on electronic Methods in Natural Language Processing (EMNLP).2020.) propose a joint model of Aspect emotion analysis for Multi-Instance Multi-Label Learning, where the attention-based ACD generates significant attention weights for different Aspect classes.
However, the distribution of data collected from an actual scene is often unbalanced, and even presents the feature of a long tail distribution, i.e. a few classes (also called head classes) occupy most of the data, while most of the classes (also called tail classes) have few samples. The above-mentioned prior art methods neglect such a sample number gap when training the model. Too large difference of the numbers of training samples of different classes can make the model unable to achieve good effect on the identification of the class with limited number of samples. The aspect categories are unbalanced, and even the caused long tail distribution can influence the learning process, so that the recognition effect is poor.
Disclosure of Invention
The invention provides an aspect type identification method and system under a long-tail distribution scene, and aims to solve the problem that identification of a model on an aspect type with a limited number of samples cannot achieve a good effect due to the fact that the number difference of training samples of different types is too large caused by long-tail distribution at present.
The first purpose of the invention is to provide an aspect category identification method under a long tail distribution scene, which is characterized in that the method is used for data sets
Figure BDA0003445749100000021
The N sentences of (1) performs aspect category identification, wherein Sl={w1,w2,...,wnIs the ith sentence in the data set D, consisting of n words, wnRepresents the l-th sentence SlThe nth word;
Figure BDA0003445749100000022
is the l-th sentence SlA corresponding aspect category label;
the method comprises the following steps:
step 1: defining m aspect classes in advance, and using A ═ a1,a2,…,amDenotes wherein amTo describe a word or phrase in the mth aspect,
Figure BDA0003445749100000023
step 2: constructing a word embedding matrix E1∈R|V|×dEach word wiEmbedding matrix E by the words1Is mapped as
Figure BDA0003445749100000024
Where | V | is the size of all words in the data set D, and D is the dimension of a word vector;
simultaneous construction of facet class embedding matrices E2∈Rm×dEach word aiEmbedding matrix E by the aspect class2Is mapped as
Figure BDA0003445749100000025
Separately deriving text-embedded vectors
Figure BDA0003445749100000026
Sum aspect embedding vector
Figure BDA0003445749100000027
Figure BDA0003445749100000028
And step 3: embedding the text into a vector
Figure BDA0003445749100000029
Embedding vectors with the aspect
Figure BDA00034457491000000210
Inputting the sentence into a long-time memory network LSTM to obtain the network output hidden state of the sentence
Figure BDA00034457491000000211
Figure BDA00034457491000000212
And
Figure BDA00034457491000000213
and 4, step 4: the hidden state HwAnd HaInputting the data into an IAN-LoT mechanism to obtain a total aspect vector representation s fusing long-tail distribution characteristics;
and 5: inputting the total aspect vector s into an attention mechanism that fuses contextual aspect-level semantic information;
summing the total aspect vector s with
Figure BDA00034457491000000214
As input, a fusion vector is calculated
Figure BDA00034457491000000215
As shown in equation (1):
Figure BDA0003445749100000031
wherein W ∈ Rn×1Is a learnable weight parameter that blends each word with an aspect,
Figure BDA0003445749100000032
vector representing semantic information of the fusion context aspect level, will
Figure BDA0003445749100000033
Input to the attention mechanism, generating an attention weight vector for each predefined facet class; as shown in equation (2), for the jth aspect class:
Figure BDA0003445749100000034
wherein Wj∈Rd×d,bj∈RdAnd uj∈RdFor a learnable parameter, β ∈ R1×mRepresenting the long tail distribution characteristic learned in advance, and is the reciprocal number of effective samples in the training set, alphaj∈RnIs an attention weight vector;
step 6: using vectors
Figure BDA0003445749100000035
As a predicted sentence representation, for the jth aspect class, as shown in equation (3):
Figure BDA0003445749100000036
wherein, Wj∈Rd×1,bjIs a scalar quantity of the input signals,
Figure BDA0003445749100000037
and when the prediction result of the jth aspect category is greater than the classification threshold value, the sentence is considered to contain the jth aspect category.
Optionally, the step of calculating a total aspect vector of the fused long-tail distribution features in the IAN-LoT mechanism includes:
step 41: hidden state for input HwAnd HaCalculating an interaction attention weight matrix I e Rn×mAs shown in equation (4):
Figure BDA0003445749100000038
step 42: performing softmax calculations for each row of the interaction attention weight matrix, as shown in equation (5):
Figure BDA0003445749100000039
wherein k isijFor the matrix k ∈ Rn×mI row and j column of (a), k represents the attention weight of the text to the aspect, IijIs the element of ith row and jth column in the matrix I;
step 43: and then introducing a data long tail distribution characteristic for the matrix k, as shown in formula (6):
Figure BDA00034457491000000310
wherein the content of the first and second substances,
Figure BDA00034457491000000311
weight information for each aspect for text introducing long tail distribution, beta epsilon R1×mRepresenting the long tail distribution characteristics learned in advance, and being the reciprocal number of the effective samples in the training set, wherein m is the number of the aspect categories;
step 44: for the
Figure BDA00034457491000000312
Performing maximum pooling to obtain fine-grained text-to-aspect weight information I blended with long-tail distribution characteristicsLFurther, the weight information and the embedded vector representation of the aspect category are expressed
Figure BDA00034457491000000313
The multiplication results in the final overall aspect vector representation s, as shown in equation (7):
Figure BDA0003445749100000041
wherein s ∈ R1×d
Optionally, the method trains the recognition model by using an improved a-DB loss function, where the improved a-DB loss function improves a calculation method of the rebalancing weight and a smoothing function, and specifically includes:
first, without considering tag co-occurrence,
Figure BDA0003445749100000042
representing the number of samples containing the jth aspect category in the data set; the expected value of the sampling frequency for the jth aspect class is
Figure BDA0003445749100000043
The sample sampling frequency P is then estimated for each of the positive classes of repeated samples contained in the exampleIAs shown in equation (8):
Figure BDA0003445749100000044
wherein the content of the first and second substances,
Figure BDA0003445749100000045
when in use
Figure BDA00034457491000000414
Indicating that the ith sentence contains the jth aspect category aj
Figure BDA0003445749100000046
If not, then not included;
weight balancing weights
Figure BDA0003445749100000047
The calculation is shown in equation (9):
Figure BDA0003445749100000048
wherein gamma is a coordination weight hyperparameter;
the smoothing function is to
Figure BDA0003445749100000049
The formula of the mapping is:
Figure BDA00034457491000000410
in order to avoid the over-suppression of a few classes caused by the advantages of negative labels, a negative class suppression hyper-parameter lambda and a specific bias tau are introducedjAs shown in formula (11):
Figure BDA00034457491000000411
where ρ isjIs the ratio of the jth category to the total number of samples, and eta is a proportion hyper-parameter;
the A-DB loss function is shown in equation (12):
Figure BDA00034457491000000412
wherein the content of the first and second substances,
Figure BDA00034457491000000413
a probability value output for the network.
Optionally, the classification threshold is 0.5.
A second object of the present invention is to provide an aspect category identification system in a long tail distribution scenario, wherein the system includes: the system comprises an input module, a text embedding module, an LSTM module, an IAN-LOT module, a fusion module, an attention mechanism module and a prediction module;
the input module, the text embedding module, the LSTM module, the IAN-LOT module, the fusion module, the attention mechanism module and the prediction module are sequentially connected;
the input module is used for inputting a predefined aspect category combination and a text to be recognized; the text aspect embedding module is used for constructing a word embedding matrix and an aspect category embedding matrix, and mapping an input predefined aspect category combination and a text to be recognized to a text embedding vector and an aspect embedding vector; the LSTM module is used for outputting the hidden states of the text embedding vector and the aspect embedding vector; the IAN-LOT module is used for obtaining a total aspect vector fusing long tail distribution characteristics according to the hidden state; the fusion module is used for fusing context level semantic information and generating a fusion vector; the attention mechanism module generates an attention weight vector for each predefined aspect category according to the fusion vector; the prediction module is used for completing classification and prediction of aspect class identification according to the attention weight vector.
Optionally, the system pairs the data sets
Figure BDA0003445749100000051
The work process of recognizing the aspect category of the N sentences comprises the following steps:
wherein S isl={w1,w2,…,wnIs the ith sentence in the data set D, consisting of n words, wiRepresents the l-th sentence SlThe ith word;
Figure BDA0003445749100000052
is the l-th sentence SlA corresponding aspect category label;
step 1: defining m aspect classes in advance, and using A ═ a1,a2,…,amDenotes wherein amTo describe a word or phrase in the mth aspect,
Figure BDA0003445749100000053
step 2: constructing a word embedding matrix E1∈R|V|×dEach word wiEmbedding matrix E by the words1Is mapped as
Figure BDA0003445749100000054
Where | V | is the size of all words in the data set D, and D is the dimension of a word vector;
simultaneous construction of facet class embedding matrices E2∈Rm×dEach word aiEmbedding matrix E by the aspect class2Is mapped as
Figure BDA0003445749100000055
Separately deriving text-embedded vectors
Figure BDA0003445749100000056
Sum aspect embedding vector
Figure BDA0003445749100000057
Figure BDA0003445749100000058
And step 3: embedding the text into a vector
Figure BDA0003445749100000059
Embedding vectors with the aspect
Figure BDA00034457491000000510
Inputting the sentence into a long-time memory network LSTM to obtain the network output hidden state of the sentence
Figure BDA00034457491000000511
Figure BDA00034457491000000512
And
Figure BDA00034457491000000513
and 4, step 4: the hidden state HwAnd HaInputting the data into an IAN-LoT mechanism to obtain a total aspect vector representation s fusing long-tail distribution characteristics;
and 5: inputting the total aspect vector s into an attention mechanism that fuses contextual aspect-level semantic information;
summing the total aspect vector s with
Figure BDA0003445749100000061
As input, a fusion vector is calculated
Figure BDA0003445749100000062
As shown in equation (1):
Figure BDA0003445749100000063
wherein W ∈ Rn×1Is a learnable weight parameter that blends each word with an aspect,
Figure BDA0003445749100000064
vector representing semantic information of the fusion context aspect level, will
Figure BDA0003445749100000065
Input to the attention mechanism, generating an attention weight vector for each predefined facet class; as shown in equation (2), for the jth aspect class:
Figure BDA0003445749100000066
wherein Wj∈Rd×d,bj∈RdAnd uj∈RdFor a learnable parameter, β ∈ R1×mRepresenting the long tail distribution characteristic learned in advance, and is the reciprocal number of effective samples in the training set, alphaj∈RnIs an attention weight vector;
step 6: using vectors
Figure BDA0003445749100000067
As a predicted sentence representation, for the jth aspect class, as shown in equation (3):
Figure BDA0003445749100000068
wherein, Wj∈Rd×1,bjIs a scalar quantity of the input signals,
Figure BDA0003445749100000069
and when the prediction result of the jth aspect category is greater than the classification threshold value, the sentence is considered to contain the jth aspect category.
Optionally, the step of calculating a total aspect vector of the fused long-tail distribution features in the IAN-LoT mechanism includes:
step 41: hidden state for input HwAnd HaCalculating an interaction attention weight matrix I e Rn×mAs shown in equation (4):
Figure BDA00034457491000000610
step 42: performing softmax calculations for each row of the interaction attention weight matrix, as shown in equation (5):
Figure BDA00034457491000000611
wherein k isijFor the matrix k ∈ Rn×mI row and j column of (a), k represents the attention weight of the text to the aspect, IijIs the element of ith row and jth column in the matrix I;
step 43: and then introducing a data long tail distribution characteristic for the matrix k, as shown in formula (6):
Figure BDA00034457491000000612
wherein the content of the first and second substances,
Figure BDA00034457491000000613
weight information for each aspect for text introducing long tail distribution, beta epsilon R1×mRepresenting the long tail distribution characteristics learned in advance, and being the reciprocal number of the effective samples in the training set, wherein m is the number of the aspect categories;
step 44: for the
Figure BDA00034457491000000614
Performing maximum pooling to obtain fine-grained text-to-aspect weight information I blended with long-tail distribution characteristicsLFurther, the weight information and the embedded vector representation of the aspect category are expressed
Figure BDA00034457491000000615
The multiplication results in the final overall aspect vector representation s, as shown in equation (7):
Figure BDA0003445749100000071
wherein s ∈ R1×d
Optionally, the system trains the recognition model by using an improved a-DB loss function, where the improved a-DB loss function improves a calculation method of the rebalancing weight and a smoothing function, and specifically includes:
first, without considering tag co-occurrence,
Figure BDA0003445749100000072
representing the number of samples containing the jth aspect category in the data set; the expected value of the sampling frequency for the jth aspect class is
Figure BDA0003445749100000073
The sample sampling frequency P is then estimated for each of the positive classes of repeated samples contained in the exampleIAs shown in equation (8):
Figure BDA0003445749100000074
wherein the content of the first and second substances,
Figure BDA0003445749100000075
when in use
Figure BDA0003445749100000076
Indicating that the ith sentence contains the jth aspect category aj
Figure BDA0003445749100000077
If not, then not included;
weight balancing weights
Figure BDA0003445749100000078
The calculation is shown in equation (9):
Figure BDA0003445749100000079
wherein gamma is a coordination weight hyperparameter;
the smoothing function is to
Figure BDA00034457491000000710
The formula of the mapping is:
Figure BDA00034457491000000711
in order to avoid the over-suppression of a few classes caused by the advantages of negative labels, a negative class suppression hyper-parameter lambda and a specific bias tau are introducedjAs shown in formula (11):
Figure BDA00034457491000000712
where ρ isjIs the ratio of the jth category to the total number of samples, and eta is a proportion hyper-parameter;
the A-DB loss function is shown in equation (12):
Figure BDA00034457491000000713
wherein the content of the first and second substances,
Figure BDA00034457491000000714
a probability value output for the network.
The invention has the beneficial effects that:
aiming at the aspect category identification problem which is characterized by long-tail distribution data, the invention models the aspect category identification into a multi-label classification problem and provides an aspect category identification model based on the fusion aspect vector of long-tail distribution;
1) an A-DB loss function suitable for the multi-label long-tail distribution problem is introduced to train a recognition model, so that the method can be effectively suitable for data with the long-tail distribution characteristic, and the aspect category recognition effect with the long-tail distribution characteristic is improved;
2) an interactive attention module with the characteristic of data long-tail distribution, namely an IAN-LoT mechanism is introduced, and the mechanism introduces the characteristic of data long-tail distribution, can obtain feature vectors in the aspect of fine granularity of sentences, provides additional context aspect-level semantic information, and can effectively improve the recognition effect of tail categories;
3) after the fine-grained aspect feature vectors are obtained, the invention also provides an attention mechanism for fusing the context aspect-level semantic information, the aspect vectors and the context information are fused, the relevant correct information of the aspect can be focused, the capability of capturing the most relevant information of the aspect category by the model is enhanced, and the model is more effective on the aspect category identification task of long-tail distribution.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a model architecture diagram of the present invention.
Fig. 2 is a flow chart of the method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The first embodiment is as follows:
the embodiment provides an aspect category identification method under a long tail distribution scene, and the method is used for identifying the category of a data set
Figure BDA0003445749100000081
Figure BDA0003445749100000082
The N sentences of (1) performs aspect category identification, wherein Sl={w1,w2,…,wnIs the ith sentence in the data set D, consisting of n words, wnRepresents the l-th sentence SlThe nth word;
Figure BDA0003445749100000083
is the l-th sentence SlA corresponding aspect category label;
the method comprises the following steps:
step 1: defining m aspect classes in advance, and using A ═ a1,a2,…,amDenotes wherein amTo describe a word or phrase in the mth aspect,
Figure BDA0003445749100000091
step 2: constructing a word embedding matrix E1∈R|V|×dEach word wiEmbedding matrix E by the words1Is mapped as
Figure BDA0003445749100000092
Where | V | is the size of all words in the data set D, and D is the dimension of a word vector;
simultaneous construction of facet class embedding matrices E2∈Rm×dEach word aiEmbedding matrix E by the aspect class2Is mapped as
Figure BDA0003445749100000093
Separately deriving text-embedded vectors
Figure BDA0003445749100000094
Sum aspect embedding vector
Figure BDA0003445749100000095
Figure BDA0003445749100000096
And step 3: embedding the text into a vector
Figure BDA0003445749100000097
Embedding vectors with the aspect
Figure BDA0003445749100000098
Inputting the sentence into a long-time memory network LSTM to obtain the network output hidden state of the sentence
Figure BDA0003445749100000099
Figure BDA00034457491000000910
And
Figure BDA00034457491000000911
and 4, step 4: the hidden state HwAnd HaInputting the data into an IAN-LoT mechanism to obtain a total aspect vector representation s fusing long-tail distribution characteristics;
and 5: inputting the total aspect vector s into an attention mechanism that fuses contextual aspect-level semantic information;
summing the total aspect vector s with
Figure BDA00034457491000000912
As input, a fusion vector is calculated
Figure BDA00034457491000000913
As shown in equation (1):
Figure BDA00034457491000000914
wherein W ∈ Rn×1Is a learnable weight parameter that blends each word with an aspect,
Figure BDA00034457491000000915
vector representing semantic information of the fusion context aspect level, will
Figure BDA00034457491000000916
Input to the attention mechanism, generating an attention weight vector for each predefined facet class; as shown in equation (2), for the jth aspect class:
Figure BDA00034457491000000917
wherein Wj∈Rd×d,bj∈RdAnd uj∈RdFor a learnable parameter, β ∈ R1×mRepresenting the long tail distribution characteristic learned in advance, and is the reciprocal number of effective samples in the training set, alphaj∈RnIs an attention weight vector;
step 6: using vectors
Figure BDA00034457491000000918
As a predicted sentence representation, for the jth aspect class, as shown in equation (3):
Figure BDA00034457491000000919
wherein, Wj∈Rd×1,bjIs a scalar quantity of the input signals,
Figure BDA00034457491000000920
and when the prediction result of the jth aspect category is greater than the classification threshold value, the sentence is considered to contain the jth aspect category.
Example two:
the method of the embodiment introduces an interactive attention module with the characteristic of data long-tail distribution, namely an IAN-LoT mechanism, and can obtain fine-grained aspect feature vectors of sentences and provide additional context aspect-level semantic information. After obtaining the fine-grained aspect feature vectors, the model also adds an attention mechanism based on long-tail distribution and fusing context-level semantic information, enhances the capability of the model to capture information most relevant to aspect categories, and is trained by using an improved multi-label classification loss function suitable for long-tail distribution.
This embodiment is for data sets
Figure BDA0003445749100000101
The N sentences of (1) performs aspect category identification, wherein Sl={w1,w2,…,wnIs the ith sentence in the data set D, consisting of n words, wnRepresents the l-th sentence SlThe nth word;
Figure BDA0003445749100000102
is the l-th sentence SlA corresponding aspect category label;
the method comprises the following steps:
step 1: defining m aspect classes in advance, and using A ═ a1,a2,…,amDenotes wherein amTo describe a word or phrase in the mth aspect,
Figure BDA0003445749100000103
step 2: constructing a word embedding matrix E1∈R|V|×dEach word wiEmbedding matrix E by the words1Is mapped as
Figure BDA0003445749100000104
Where | V | is the size of all words in the data set D, and D is the dimension of a word vector;
simultaneous construction of facet class embedding matrices E2∈Rm×dEach word aiEmbedding matrix E by the aspect class2Is mapped as
Figure BDA0003445749100000105
Separately deriving text-embedded vectors
Figure BDA0003445749100000106
Sum aspect embedding vector
Figure BDA0003445749100000107
Figure BDA0003445749100000108
And step 3: embedding the text into a vector
Figure BDA0003445749100000109
Embedding vectors with the aspect
Figure BDA00034457491000001010
Inputting the sentence into a long-time memory network LSTM to obtain the network output hidden state of the sentence
Figure BDA00034457491000001011
Figure BDA00034457491000001012
And
Figure BDA00034457491000001013
and 4, step 4: the hidden state HwAnd HaInputting the data into an IAN-LoT mechanism to obtain a total aspect vector representation s fusing long-tail distribution characteristics;
step 41: hidden state for input HwAnd HaCalculating an interaction attention weight matrix I e Rn×mAs shown in equation (1):
Figure BDA00034457491000001014
step 42: performing softmax calculations for each row of the interaction attention weight matrix, as shown in equation (2):
Figure BDA00034457491000001015
wherein k isijFor the matrix k ∈ Rn×mI row and j column of (a), k represents the attention weight of the text to the aspect, IijIs the element of ith row and jth column in the matrix I;
step 43: and then introducing a data long tail distribution characteristic for the matrix k, as shown in formula (3):
Figure BDA0003445749100000111
wherein the content of the first and second substances,
Figure BDA0003445749100000112
weight information for each aspect for text introducing long tail distribution, beta epsilon R1×mRepresenting the long tail distribution characteristics learned in advance, and being the reciprocal number of the effective samples in the training set, wherein m is the number of the aspect categories;
step 44: for the
Figure BDA0003445749100000113
Performing maximum pooling to obtain fine-grained text-to-aspect weight information I blended with long-tail distribution characteristicsLFurther, the weight information and the embedded vector representation of the aspect category are expressed
Figure BDA0003445749100000114
The multiplication results in the final overall aspect vector representation s, as shown in equation (4):
Figure BDA0003445749100000115
wherein s ∈ R1×d
And 5: inputting the total aspect vector s into an attention mechanism that fuses contextual aspect-level semantic information;
summing the total aspect vector s with
Figure BDA0003445749100000116
As input, a fusion vector is calculated
Figure BDA0003445749100000117
As shown in equation (5):
Figure BDA0003445749100000118
wherein W ∈ Rn×1Is a learnable weight parameter that blends each word with an aspect,
Figure BDA0003445749100000119
vector representing semantic information of the fusion context aspect level, will
Figure BDA00034457491000001110
Input to the attention mechanism, generating an attention weight vector for each predefined facet class; as shown in equation (6), for the jth aspect class:
Figure BDA00034457491000001111
wherein Wj∈Rd×d,bj∈RdAnd uj∈RdFor a learnable parameter, β ∈ R1×mRepresenting the long tail distribution characteristic learned in advance, and is the reciprocal number of effective samples in the training set, alphaj∈RnIs an attention weight vector;
step 6: using vectors
Figure BDA00034457491000001112
As a predicted sentence representation, for the jth aspect class, as shown in equation (7):
Figure BDA00034457491000001113
wherein, Wj∈Rd×1,bjIs a scalar quantity of the input signals,
Figure BDA00034457491000001114
for the prediction result of the jth aspect category, when the prediction result is greater than the classification threshold value of 0.5, the sentence is considered to contain the jth aspect category.
The method of this embodiment trains the recognition model by using an improved a-DB loss function, which improves the calculation method of the rebalancing weight and the smoothing function, and specifically includes:
first, without considering tag co-occurrence,
Figure BDA00034457491000001115
representing the number of samples containing the jth aspect category in the data set; the expected value of the sampling frequency for the jth aspect class is
Figure BDA00034457491000001116
The sample sampling frequency P is then estimated for each of the positive classes of repeated samples contained in the exampleIAs shown in equation (8):
Figure BDA0003445749100000121
wherein the content of the first and second substances,
Figure BDA0003445749100000122
when in use
Figure BDA0003445749100000123
Indicating that the ith sentence contains the jth partyClass a of surfacej
Figure BDA0003445749100000124
If not, then not included;
weight balancing weights
Figure BDA0003445749100000125
The calculation is shown in equation (9):
Figure BDA0003445749100000126
wherein gamma is a coordination weight hyperparameter;
the smoothing function is to
Figure BDA0003445749100000127
The formula of the mapping is:
Figure BDA0003445749100000128
in order to avoid the over-suppression of a few classes caused by the advantages of negative labels, a negative class suppression hyper-parameter lambda and a specific bias tau are introducedjAs shown in formula (11):
Figure BDA0003445749100000129
where ρ isjIs the ratio of the jth category to the total number of samples, and eta is a proportion hyper-parameter;
the A-DB loss function is shown in equation (12):
Figure BDA00034457491000001210
wherein the content of the first and second substances,
Figure BDA00034457491000001211
a probability value output for the network.
Example three:
the embodiment provides an aspect category identification system under a long tail distribution scene, the system comprising: the system comprises an input module, a text embedding module, an LSTM module, an IAN-LOT module, a fusion module, an attention mechanism module and a prediction module;
the input module, the text embedding module, the LSTM module, the IAN-LOT module, the fusion module, the attention mechanism module and the prediction module are sequentially connected;
the input module is used for inputting a predefined aspect category combination and a text to be recognized; the text aspect embedding module is used for constructing a word embedding matrix and an aspect category embedding matrix, and mapping an input predefined aspect category combination and a text to be recognized to a text embedding vector and an aspect embedding vector; the LSTM module is used for outputting the hidden states of the text embedding vector and the aspect embedding vector; the IAN-LOT module is used for obtaining a total aspect vector fusing long tail distribution characteristics according to the hidden state; the fusion module is used for fusing context level semantic information and generating a fusion vector; the attention mechanism module generates an attention weight vector for each predefined aspect category according to the fusion vector; the prediction module is used for completing classification and prediction of aspect class identification according to the attention weight vector.
For example, in a restaurant review, "When we sat down, the waiter barly spoken in the outer direction and abrupply overlapped outer menu on the table," we will classify this sentence and a predefined set of aspects: food, staff, miscellaneous, place, service, menu, price, and ambiance are input into the model shown in fig. 1, and a set of results can be predicted by using the model, wherein 1 is that the sentence includes the corresponding aspect type, and 0 is that the sentence does not include the corresponding aspect type, as shown in table 1:
Category food staff miscellaneous place service menu price ambience
label 0 1 0 0 0 1 0 0
to further illustrate the beneficial effects that the present invention can achieve, the following experiments were performed:
the present invention uses 6 baseline methods for comparison:
(1) aspect category identification model:
TextCNN [34 ]: the method for classifying the text by utilizing the convolutional neural network is a more basic model;
LSTM [24 ]: training by using an LSTM network, and classifying by taking the last hidden state as a final form;
SVR [35 ]: combining word vectors of a sentence into a vector as input, and classifying by using a machine learning classifier;
SCAN [36 ]: an aspect category emotion analysis method based on a sentence component perception network.
(2) The ACD task and ACSA task combined training model comprises the following steps:
AS-Capsules [37 ]: a method for performing aspect category sentiment analysis by sharing components using correlation between aspect categories and sentiments;
AC-MIMLLN [6 ]: a joint model of multi-instance multi-label learning aspect emotion analysis is presented, where an attention-based ACD generates valid attention weights for different aspect categories. We compare for such models using their predicted results of ACD.
Table 2 shows a comparison of the results of the Macro F1 experiments on the ACD task by the method of the present invention and other methods of comparison on MAMS-LT and SemEval2014-LT data sets.
Table 2 data comparison results
Figure BDA0003445749100000131
Figure BDA0003445749100000141
In order to better research the problem of Long-tail Distribution, the invention refers to a method for creating a Multi-Label image Dataset satisfying the Long-tail Distribution by referring to (Wu, T.et al., Distribution-Balanced Loss for Multi-Label Classification in Long-labeled Datasets.2020: Computer Vision-ECCV 2020.), and an existing SemEval-2014 Task 4(Pontiki, M.et al., SemEval-Task 4: assembled basic analysis.2014.) and a MAMS Dataset (Jiang, Q.A. Challege Dataset and influence for batch analysis in Long-tail Distribution of Distribution 2019. meeting the requirements of the Emulation-topic analysis of the Long-tail Distribution and the load of the national Distribution of the origin of the image Dataset (local, quality, A. Challege Dataset and influence) are respectively created to conform to the emotion Distribution of the Long-Tail Distribution of the local Distribution of the origin and the origin of the local Distribution (local Distribution). Table 1 shows the Macro F1 scores of the baseline method and the method of the invention in the MAMS-LT data set and the SemEval2014-LT data set, and the higher the score is, the better the classification effect is.
From the experimental results, we can conclude the following:
firstly, the method is superior to all baseline methods in the MAMS-LT data set and the SemEval2014-LT data set, and the method has better aspect class detection capability in the data set with the long tail distribution characteristic.
Secondly, compared with the best scoring baseline method AS-Capsules, the method of the present invention is 2.28% and 1.92% higher on both datasets, respectively, indicating that the Macro F1 score of the method of the present invention is a distinct advantage on the MAMS-LT dataset, demonstrating that the method of the present invention has a better effect on aspect category detection for sentences containing multiple aspects.
Thirdly, the reason that the effect of the method of the present invention on the SemEval2014-LT data set is not as good as that of the MAMS-LT data set may be that most of the sentences of the former data set only contain one aspect category, which may weaken the effect of the rebalancing weights designed for the tag co-occurrence problem, but it can be seen that the effect is significantly improved on the MAMS-LT data sets with each sentence containing two or more aspects.
Table 3 shows the Macro F1 score experimental results for each aspect of AS-Capsules and the method of the present invention in the SemEval2014-LT dataset.
Table 3 comparison of AS-Capsules on SemEval2014-LT dataset with the results of the method of the invention
Figure BDA0003445749100000142
Figure BDA0003445749100000151
The SemEval2014-LT data set constructed by the method comprises 1, 2 and 2 classes of a header class, a middle class and a tail class respectively. The results show that:
first, for the tail classes such as price and ambiance, the Macro F1 scores were 5.99% and 7.88% higher, respectively, which indicates that the method of the present invention has a significantly improved detection effect on the aspect classes of the tail classes. The prediction result of the tailing class 'price' is even better than that of the head class 'food', and the prediction of the AS-Capsules on the tailing class 'price' is reduced by 9.83% compared with that of the head class 'food'. This therefore also proves the effectiveness of the work of the long tail distribution proposed by the present invention from an experimental point of view.
Secondly, for a head class 'food', the method of the invention is lower than the AS-Capsules of a comparison method, and possible reasons are two, one is that the model of the invention is more in processing the tail class, and more weight is distributed to the tail class when the fine-grained aspect is fused, so more information of the tail class is concerned when an attention mechanism is used; secondly, the use of weight balancing weights: in the improved loss function, the weight of the head classes is reduced, and the suppression effect is more obvious when the number of the head classes is larger, so that the prediction effect of the head classes can be weakened to a certain extent.
Some steps in the embodiments of the present invention may be implemented by software, and the corresponding software program may be stored in a readable storage medium, such as an optical disc or a hard disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. An aspect category identification method under a long tail distribution scene is characterized in that the method identifies a data set
Figure FDA0003445749090000011
Figure FDA0003445749090000012
The N sentences of (1) performs aspect category identification, wherein Sl={w1,w2,…,wnIs the ith sentence in the data set D, consisting of n words, wnRepresents the l-th sentence SlThe nth word;
Figure FDA0003445749090000013
is the l-th sentence SlA corresponding aspect category label;
the method comprises the following steps:
step 1: defining m aspect classes in advance, and using A ═ a1,a2,…,amDenotes wherein amTo describe a word or phrase in the mth aspect,
Figure FDA0003445749090000014
step 2: constructing a word embedding matrix E1∈R|V|×dEach word wiEmbedding matrix E by the words1Is mapped as
Figure FDA0003445749090000015
Where | V | is the size of all words in the data set D, and D is the dimension of a word vector;
simultaneous construction of facet class embedding matrices E2∈Rm×dEach word aiEmbedding matrix E by the aspect class2Is mapped as
Figure FDA0003445749090000016
Separately deriving text-embedded vectors
Figure FDA0003445749090000017
Sum aspect embedding vector
Figure FDA0003445749090000018
Figure FDA0003445749090000019
And step 3: embedding the text into a vector
Figure FDA00034457490900000110
Embedding vectors with the aspect
Figure FDA00034457490900000111
Inputting the sentence into a long-time memory network LSTM to obtain the network output hidden state of the sentence
Figure FDA00034457490900000112
Figure FDA00034457490900000113
And
Figure FDA00034457490900000114
and 4, step 4: the hidden state HwAnd HaInputting the data into an IAN-LoT mechanism to obtain a total aspect vector representation s fusing long-tail distribution characteristics;
and 5: inputting the total aspect vector s into an attention mechanism for fusing context aspect-level semantic information, and calculating a fusion vector
Figure FDA00034457490900000115
Step 6: using the fused vector
Figure FDA00034457490900000116
As a predicted sentence representation, for the jth aspect class, as shown in equation (1):
Figure FDA00034457490900000117
wherein, Wj∈Rd×1,bjIs a scalar quantity of the input signals,
Figure FDA00034457490900000118
and when the prediction result of the jth aspect category is greater than the classification threshold value, the sentence is considered to contain the jth aspect category.
2. The method of claim 1, wherein the step of calculating the total aspect vector of the fused long tail distribution features in the IAN-LoT mechanism comprises:
step 41: hidden state for input HwAnd HaCalculating an interaction attention weight matrix I e Rn×mAs shown in equation (2):
Figure FDA00034457490900000119
step 42: performing softmax calculations for each row of the interaction attention weight matrix, as shown in equation (3):
Figure FDA0003445749090000021
wherein k isijFor the matrix k ∈ Rn×mI row and j column of (a), k represents the attention weight of the text to the aspect, IijIs the element of ith row and jth column in the matrix I;
step 43: and then introducing a data long tail distribution characteristic for the matrix k, as shown in formula (4):
Figure FDA0003445749090000022
wherein the content of the first and second substances,
Figure FDA0003445749090000023
weight information for each aspect for text introducing long tail distribution, beta epsilon R1×mRepresenting the long tail distribution characteristics learned in advance, and being the reciprocal number of the effective samples in the training set, wherein m is the number of the aspect categories;
step 44: for the
Figure FDA00034457490900000214
Performing maximum pooling to obtain fine-grained text-to-aspect weight information I blended with long-tail distribution characteristicsLFurther, the weight information and the embedded vector representation of the aspect category are expressed
Figure FDA0003445749090000024
The multiplication results in the final overall aspect vector representation s, as shown in equation (5):
Figure FDA0003445749090000025
wherein s ∈ R1×d
3. The method of claim 2, wherein the fused vector of fused context aspect-level semantic information
Figure FDA0003445749090000026
The calculation process of (2) includes:
summing the total aspect vector s with
Figure FDA0003445749090000027
As input, a fusion vector is calculated
Figure FDA0003445749090000028
As shown in equation (6):
Figure FDA0003445749090000029
wherein W ∈ Rn×1Is a learnable weight parameter that blends each word with an aspect,
Figure FDA00034457490900000210
vector representing semantic information of the fusion context aspect level, will
Figure FDA00034457490900000211
Input to the attention mechanism, generating an attention weight vector for each predefined facet class; as shown in equation (7), for the jth aspect class:
Figure FDA00034457490900000212
wherein Wj∈Rd×d,bj∈RdAnd uj∈RdFor a learnable parameter, β ∈ R1×mRepresenting the long tail distribution characteristic learned in advance, and is the reciprocal number of effective samples in the training set, alphaj∈RnIs the attention weight vector.
4. The method of claim 3, wherein the method trains the recognition model using an improved A-DB loss function that improves the way the rebalancing weights are computed and the smoothing function, and comprises:
first, without considering tag co-occurrence,
Figure FDA00034457490900000213
representing the number of samples containing the jth aspect category in the data set; the expected value of the sampling frequency for the jth aspect class is
Figure FDA0003445749090000031
The sample sampling frequency P is then estimated for each of the positive classes of repeated samples contained in the exampleIAs shown in equation (8):
Figure FDA0003445749090000032
wherein the content of the first and second substances,
Figure FDA0003445749090000033
when in use
Figure FDA0003445749090000034
Indicating that the ith sentence contains the jth aspect category aj
Figure FDA0003445749090000035
If not, then not included;
weight balancing weights
Figure FDA0003445749090000036
The calculation is shown in equation (9):
Figure FDA0003445749090000037
wherein gamma is a coordination weight hyperparameter;
the smoothing function is to
Figure FDA0003445749090000038
The formula of the mapping is:
Figure FDA0003445749090000039
in order to avoid the over-suppression of a few classes caused by the advantages of negative labels, a negative class suppression hyper-parameter lambda and a specific bias tau are introducedjAs shown in formula (11):
Figure FDA00034457490900000310
where ρ isjIs the ratio of the jth category to the total number of samples, and eta is a proportion hyper-parameter;
the A-DB loss function is shown in equation (12):
Figure FDA00034457490900000311
wherein the content of the first and second substances,
Figure FDA00034457490900000312
a probability value output for the network.
5. The method of claim 1, wherein the classification threshold is 0.5.
6. An aspect category identification system under a long tail distribution scene, the system comprising: the system comprises an input module, a text embedding module, an LSTM module, an IAN-LOT module, a fusion module, an attention mechanism module and a prediction module;
the input module, the text embedding module, the LSTM module, the IAN-LOT module, the fusion module, the attention mechanism module and the prediction module are sequentially connected;
the input module is used for inputting a predefined aspect category combination and a text to be recognized; the text aspect embedding module is used for constructing a word embedding matrix and an aspect category embedding matrix, and mapping an input predefined aspect category combination and a text to be recognized to a text embedding vector and an aspect embedding vector; the LSTM module is used for outputting the hidden states of the text embedding vector and the aspect embedding vector; the IAN-LOT module is used for obtaining a total aspect vector fusing long tail distribution characteristics according to the hidden state; the fusion module is used for fusing context level semantic information and generating a fusion vector; the attention mechanism module generates an attention weight vector for each predefined aspect category according to the fusion vector; the prediction module is used for completing classification and prediction of aspect class identification according to the attention weight vector.
7. The system of claim 6, wherein the system is to a data set
Figure FDA0003445749090000041
Figure FDA0003445749090000042
The work process of recognizing the aspect category of the N sentences comprises the following steps:
wherein S isl={w1,w2,…,wnIs the ith sentence in the data set D, consisting of n words, wiRepresents the l-th sentence SlThe ith word;
Figure FDA0003445749090000043
is the l-th sentence SlA corresponding aspect category label;
step 1: defining m aspect classes in advance, and using A ═ a1,a2,…,amDenotes wherein amTo describe a word or phrase in the mth aspect,
Figure FDA0003445749090000044
step 2: constructing a word embedding matrix E1∈R|V|×dEach word wiEmbedding matrix E by the words1Is mapped as
Figure FDA0003445749090000045
Where | V | is the size of all words in the data set D, and D is the dimension of a word vector;
simultaneous construction of facet class embedding matrices E2∈Rm×dEach word aiEmbedding matrix E by the aspect class2Is mapped as
Figure FDA0003445749090000046
Separately deriving text-embedded vectors
Figure FDA0003445749090000047
Sum aspect embedding vector
Figure FDA0003445749090000048
Figure FDA0003445749090000049
And step 3: embedding the text into a vector
Figure FDA00034457490900000410
Embedding vectors with the aspect
Figure FDA00034457490900000411
Inputting the sentence into a long-time memory network LSTM to obtain the network output hidden state of the sentence
Figure FDA00034457490900000412
Figure FDA00034457490900000413
And
Figure FDA00034457490900000414
and 4, step 4: the hidden state HwAnd HaInputting the data into an IAN-LoT mechanism to obtain a total aspect vector representation s fusing long-tail distribution characteristics;
and 5: inputting the total aspect vector s into an attention mechanism for fusing context aspect-level semantic information, and calculating a fusion vector
Figure FDA00034457490900000415
Step 6: using vectors
Figure FDA00034457490900000416
As a predicted sentence representation, for the jth aspect class, as shown in equation (1):
Figure FDA00034457490900000417
wherein, Wj∈Rd×1,bjIs a scalar quantity of the input signals,
Figure FDA00034457490900000418
and when the prediction result of the jth aspect category is greater than the classification threshold value, the sentence is considered to contain the jth aspect category.
8. The system of claim 7, wherein the step of calculating the total aspect vector of the fused long tail distribution features in the IAN-LoT mechanism comprises:
step 41: hidden state for input HwAnd HaCalculating an interaction attention weight matrix I e Rn×mAs shown in equation (2):
Figure FDA0003445749090000051
step 42: performing softmax calculations for each row of the interaction attention weight matrix, as shown in equation (3):
Figure FDA0003445749090000052
wherein k isijFor the matrix k ∈ Rn×mI row and j column of (a), k represents the attention weight of the text to the aspect, IijIs the element of ith row and jth column in the matrix I;
step 43: and then introducing a data long tail distribution characteristic for the matrix k, as shown in formula (4):
Figure FDA0003445749090000053
wherein the content of the first and second substances,
Figure FDA0003445749090000054
weight information for each aspect for text introducing long tail distribution, beta epsilon R1×mRepresenting the long tail distribution characteristics learned in advance, and being the reciprocal number of the effective samples in the training set, wherein m is the number of the aspect categories;
step 44: for the
Figure FDA0003445749090000055
Performing maximum pooling to obtain fine-grained text-to-aspect weight information I blended with long-tail distribution characteristicsLFurther, the weight information and the embedded vector representation of the aspect category are expressed
Figure FDA0003445749090000056
The multiplication results in the final overall aspect vector representation s, as shown in equation (5):
Figure FDA0003445749090000057
wherein s ∈ R1×d
9. The system of claim 8, wherein the vector representation of fused contextual aspect-level semantic information
Figure FDA0003445749090000058
The calculation process of (2) includes:
summing the total aspect vector s with
Figure FDA0003445749090000059
As input, a fusion vector is calculated
Figure FDA00034457490900000510
As shown in equation (6):
Figure FDA00034457490900000511
wherein W ∈ Rn×1Is a learnable weight parameter that blends each word with an aspect,
Figure FDA00034457490900000512
vector representing semantic information of the fusion context aspect level, will
Figure FDA00034457490900000513
Input to the attention mechanism, generating an attention weight vector for each predefined facet class; as shown in equation (7), for the jth aspect class:
Figure FDA00034457490900000514
wherein Wj∈Rd×d,bj∈RdAnd uj∈RdFor a learnable parameter, β ∈ R1×mRepresenting the long tail distribution characteristic learned in advance, and is the reciprocal number of effective samples in the training set, alphaj∈RnIs the attention weight vector.
10. The system of claim 9, wherein the system trains the recognition model using an improved a-DB loss function that improves the way rebalancing weights are computed and the smoothing function, and further comprising:
first, without considering tag co-occurrence,
Figure FDA0003445749090000061
representing the number of samples containing the jth aspect category in the data set; the expected value of the sampling frequency for the jth aspect class is
Figure FDA0003445749090000062
The sample sampling frequency P is then estimated for each of the positive classes of repeated samples contained in the exampleIAs shown in equation (8):
Figure FDA0003445749090000063
wherein the content of the first and second substances,
Figure FDA0003445749090000064
when in use
Figure FDA0003445749090000065
Indicating that the ith sentence contains the jth aspect category aj
Figure FDA0003445749090000066
If not, then not included;
weight balancing weights
Figure FDA0003445749090000067
The calculation is shown in equation (9):
Figure FDA0003445749090000068
wherein gamma is a coordination weight hyperparameter;
the smoothing function is to
Figure FDA0003445749090000069
The formula of the mapping is:
Figure FDA00034457490900000610
in order to avoid the over-suppression of a few classes caused by the advantages of negative labels, a negative class suppression hyper-parameter lambda and a specific bias tau are introducedjAs shown in formula (11):
Figure FDA00034457490900000611
where ρ isjIs the ratio of the jth category to the total number of samples, and eta is a proportion hyper-parameter;
the A-DB loss function is shown in equation (12):
Figure FDA00034457490900000612
wherein the content of the first and second substances,
Figure FDA00034457490900000613
a probability value output for the network.
CN202111681644.8A 2021-12-30 2021-12-30 Aspect category identification method and system in long tail distribution scene Active CN114297390B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111681644.8A CN114297390B (en) 2021-12-30 2021-12-30 Aspect category identification method and system in long tail distribution scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111681644.8A CN114297390B (en) 2021-12-30 2021-12-30 Aspect category identification method and system in long tail distribution scene

Publications (2)

Publication Number Publication Date
CN114297390A true CN114297390A (en) 2022-04-08
CN114297390B CN114297390B (en) 2024-04-02

Family

ID=80975589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111681644.8A Active CN114297390B (en) 2021-12-30 2021-12-30 Aspect category identification method and system in long tail distribution scene

Country Status (1)

Country Link
CN (1) CN114297390B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115563284A (en) * 2022-10-24 2023-01-03 重庆理工大学 Deep multi-instance weak supervision text classification method based on semantics

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190143415A (en) * 2018-06-20 2019-12-30 강원대학교산학협력단 Method of High-Performance Machine Reading Comprehension through Feature Selection
CN111581981A (en) * 2020-05-06 2020-08-25 西安交通大学 Evaluation object strengthening and constraint label embedding based aspect category detection system and method
CN112199504A (en) * 2020-10-30 2021-01-08 福州大学 Visual angle level text emotion classification method and system integrating external knowledge and interactive attention mechanism
CN112686056A (en) * 2021-03-22 2021-04-20 华南师范大学 Emotion classification method
CN113222059A (en) * 2021-05-28 2021-08-06 北京理工大学 Multi-label emotion classification method using cooperative neural network chain
WO2021164199A1 (en) * 2020-02-20 2021-08-26 齐鲁工业大学 Multi-granularity fusion model-based intelligent semantic chinese sentence matching method, and device
US11194972B1 (en) * 2021-02-19 2021-12-07 Institute Of Automation, Chinese Academy Of Sciences Semantic sentiment analysis method fusing in-depth features and time sequence models

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190143415A (en) * 2018-06-20 2019-12-30 강원대학교산학협력단 Method of High-Performance Machine Reading Comprehension through Feature Selection
WO2021164199A1 (en) * 2020-02-20 2021-08-26 齐鲁工业大学 Multi-granularity fusion model-based intelligent semantic chinese sentence matching method, and device
CN111581981A (en) * 2020-05-06 2020-08-25 西安交通大学 Evaluation object strengthening and constraint label embedding based aspect category detection system and method
CN112199504A (en) * 2020-10-30 2021-01-08 福州大学 Visual angle level text emotion classification method and system integrating external knowledge and interactive attention mechanism
US11194972B1 (en) * 2021-02-19 2021-12-07 Institute Of Automation, Chinese Academy Of Sciences Semantic sentiment analysis method fusing in-depth features and time sequence models
CN112686056A (en) * 2021-03-22 2021-04-20 华南师范大学 Emotion classification method
CN113222059A (en) * 2021-05-28 2021-08-06 北京理工大学 Multi-label emotion classification method using cooperative neural network chain

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
王家乾;龚子寒;薛云;庞士冠;古东宏;: "基于混合多头注意力和胶囊网络的特定目标情感分析", 中文信息学报, no. 05, 15 May 2020 (2020-05-15) *
邓立明;魏晶晶;吴运兵;余小燕;廖祥文;: "基于知识图谱与循环注意力网络的视角级情感分析", 模式识别与人工智能, no. 06, 15 June 2020 (2020-06-15) *
黄露;周恩国;李岱峰;: "融合特定任务信息注意力机制的文本表示学习模型", 数据分析与知识发现, no. 09, 31 December 2020 (2020-12-31) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115563284A (en) * 2022-10-24 2023-01-03 重庆理工大学 Deep multi-instance weak supervision text classification method based on semantics

Also Published As

Publication number Publication date
CN114297390B (en) 2024-04-02

Similar Documents

Publication Publication Date Title
Ishaq et al. Aspect-based sentiment analysis using a hybridized approach based on CNN and GA
CN112084335B (en) Social media user account classification method based on information fusion
Ye et al. Advise: Symbolism and external knowledge for decoding advertisements
Jain et al. Sentiment classification of twitter data belonging to renewable energy using machine learning
CN105279495A (en) Video description method based on deep learning and text summarization
CN109101490B (en) Factual implicit emotion recognition method and system based on fusion feature representation
CN112861541A (en) Commodity comment sentiment analysis method based on multi-feature fusion
CN112256866A (en) Text fine-grained emotion analysis method based on deep learning
CN112307336A (en) Hotspot information mining and previewing method and device, computer equipment and storage medium
Ara et al. Understanding customer sentiment: Lexical analysis of restaurant reviews
Gandhi et al. Multimodal sentiment analysis: review, application domains and future directions
Nareshkumar et al. Interactive deep neural network for aspect-level sentiment analysis
Fouladfar et al. Predicting the helpfulness score of product reviews using an evidential score fusion method
CN114297390A (en) Aspect category identification method and system under long-tail distribution scene
JP4054046B2 (en) Opinion determination database creation method and apparatus and program, opinion determination method and apparatus and program, and computer-readable recording medium
CN107291686B (en) Method and system for identifying emotion identification
CN109902174B (en) Emotion polarity detection method based on aspect-dependent memory network
Rajput et al. Analysis of various sentiment analysis techniques
Nagpal et al. Effective approach for sentiment analysis of food delivery apps
CN114840665A (en) Rumor detection method and device based on emotion analysis and related medium
Sungsri et al. The analysis and summarizing system of thai hotel reviews using opinion mining technique
CN111340329A (en) Actor assessment method and device and electronic equipment
Tuah et al. Sentiment Analysis of Political Party News on the Online News Portal Detik. com Using LSTM and CNN
Li et al. Deep recommendation based on dual attention mechanism
Velammal Development of knowledge based sentiment analysis system using lexicon approach on twitter data

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant