CN109635109B - Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism - Google Patents


Info

Publication number
CN109635109B
Authority
CN
China
Prior art keywords
speech
layer
attention
sentence
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811430542.7A
Other languages
Chinese (zh)
Other versions
CN109635109A (en)
Inventor
苏锦钿
周炀
朱展东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201811430542.7A priority Critical patent/CN109635109B/en
Publication of CN109635109A publication Critical patent/CN109635109A/en
Application granted granted Critical
Publication of CN109635109B publication Critical patent/CN109635109B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24137 Distances to cluster centroïds
    • G06F18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

The invention discloses a sentence classification method based on LSTM that combines part-of-speech information with a multi-attention mechanism, comprising the following steps: in the input layer, converting each sentence into two continuous, dense matrices, a semantic word vector matrix and a part-of-speech word vector matrix; in a shared bidirectional LSTM layer, learning the context information of the words and of the parts of speech in the sentence respectively, and concatenating and outputting the learning results of each step; in the self-attention layer, using a self-attention mechanism with a dot-product function to learn important local features at each position in the sentence from the semantic word vector sequence and the part-of-speech word vector sequence respectively, obtaining the corresponding semantic attention vector and part-of-speech attention vector, which are constrained through the KL (Kullback-Leibler) distance; in the merging layer, using the two attention vectors to compute weighted sums of the output sequence of the bidirectional LSTM layer, obtaining the semantic representation and the part-of-speech representation of the sentence and from them the final semantic representation of the sentence; and finally, performing prediction and classification output through an MLP output layer.

Description

Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism
Technical Field
The invention relates to the field of natural language processing, in particular to a sentence classification method based on LSTM that combines part-of-speech information with a multi-attention mechanism.
Background
Sentence classification has long been a research hotspot in the field of Natural Language Processing (NLP). In recent years, with the wide application of deep learning in NLP, many scholars have proposed various sentence classification methods based on Long Short-Term Memory (LSTM) networks and achieved better results than conventional machine learning methods on many sentence classification corpora, such as Stanford Twitter Sentiment (STS), the Stanford Sentiment Treebank binary classification (SSTb2) and five-class classification (SSTb5) tasks, TREC and IMDB. Compared with the Convolutional Neural Network (CNN), the LSTM better captures the context information and long-range dependencies of text sequences, and effectively avoids the gradient vanishing and gradient explosion problems of the traditional RNN (Recurrent Neural Network) model, so it is widely applied to sentence classification tasks.
Currently, LSTM-based sentence classification models mainly use word vectors trained on large-scale corpora to convert the words in a sentence into distributed representations. Existing research shows that word vectors obtained from large-scale corpus training contain fairly comprehensive syntactic and semantic information, and can greatly improve sentence classification. The pre-trained word vectors commonly used today are mainly produced by the CBOW or Skip-gram models of word2vec, the GloVe algorithm or the FastText algorithm. When training word vectors, these models and algorithms rely primarily on word co-occurrence within a window (or globally) and do not encode the part of speech of the words themselves. The trained word vectors therefore carry only content-level information and do not reflect part-of-speech information. In a general text classification task (such as news classification), feature words, which are mainly nouns and verbs, are important indicators of the class, for example "a typhoon will reach the southeast coast of China" or "China will continue to cut taxes for small and medium-sized enterprises". In text sentiment classification, the opinion words and sentiment words that indicate positive or negative tendency matter more, and these are mainly verbs and adjectives, for example "I like this movie" or "this movie looks great". Related studies also show that adjectives are the main carriers of opinion and sentiment. Introducing part-of-speech information can therefore enrich the feature representation of a sentence and help improve the classification result. In recent years, some scholars have brought the attention mechanism from image processing into NLP and achieved a series of state-of-the-art results on subtasks such as machine translation, text summarization, relation extraction, reading comprehension and textual entailment. The attention mechanism lets a model weigh the different influences of the elements of the input on the target result, and mitigates the loss of detail information in long sentences. Other researchers have proposed the self-attention (also called intra-attention) mechanism, whose main idea is to use the position information of each element of a sentence to compute a corresponding attention vector and characterize the sentence. The combination of LSTM with attention (or self-attention) mechanisms has become the core of many models. These studies, however, target attention at the content level only, and do not consider the part-of-speech information of the words.
Disclosure of Invention
The aim of the invention is to provide, against the shortcomings of the prior art, a sentence classification method based on LSTM that combines part-of-speech information with a multi-attention mechanism. The method exploits the more accurate syntactic and semantic information that a large-scale corpus can provide, while introducing the part-of-speech information of the sentence to compensate for the lack of part-of-speech information in pre-trained word vectors, thereby better describing the syntactic and semantic characteristics of the sentence.
The purpose of the invention can be realized by the following technical scheme:
a sentence classification method based on LSTM and combined with part of speech and multi-attention mechanism is based on the following five-layer neural network model, the first layer to the fifth layer are respectively an input layer, a shared bidirectional LSTM layer, a self-attention layer, a merging layer and an MLP output layer, and the method specifically comprises the following steps:
after preprocessing the sentences in the input layer, respectively utilizing a pre-training word vector table and a matrix generated based on uniformly distributed random initialization to give mathematical expressions of each word and the part of speech thereof in the sentences, thereby converting each sentence into a semantic word vector matrix and a part of speech word vector matrix;
respectively learning the context information of words or parts of speech in sentences through two LSTM layers in opposite directions in a shared bidirectional LSTM layer, and outputting the learning results of each step after being connected in series;
in the self-attention layer, a self-attention mechanism and a point multiplication function are adopted to learn important local features at each position in a sentence from a semantic word vector sequence and a part-of-speech word vector sequence respectively to obtain corresponding semantic attention vectors and part-of-speech attention vectors, and the semantic attention vectors and the part-of-speech attention vectors are constrained by KL (karhunen-Loeve) distance so as to ensure that the semantic attention vectors and the part-of-speech attention vectors are distributed on each position in the sentence as consistent as possible;
in the merging layer, the semantic attention vector and the part-of-speech attention vector obtained from the attention layer are used for carrying out weighted summation on the output sequence of the bidirectional LSTM layer to obtain semantic representation and part-of-speech representation of the sentence, and then final sentence semantic representation is obtained by comparing weighted average, series connection, summation and maximum value calculation;
and finally, performing prediction and classified output through an MLP output layer comprising a fully-connected hidden layer and a fully-connected softmax layer.
Further, the preprocessing of the sentence in the input layer includes performing word segmentation, illegal character filtering and length completion operations on the sentence.
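As a concrete illustration, a minimal Python sketch of these three operations follows, assuming NLTK for word segmentation; MAX_LEN, the regular-expression character whitelist and the "<pad>" token are illustrative choices, not values fixed by the invention:

    import re
    from nltk.tokenize import word_tokenize

    MAX_LEN = 50  # assumed sentence-length threshold

    def preprocess(sentence: str) -> list[str]:
        sentence = re.sub(r"[^\w\s'.,!?-]", " ", sentence)   # filter illegal characters
        tokens = word_tokenize(sentence.lower())             # word segmentation
        tokens = tokens[:MAX_LEN]                            # truncate over-long sentences
        return tokens + ["<pad>"] * (MAX_LEN - len(tokens))  # length completion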
Furthermore, the number of neurons in the fully connected hidden layer of the MLP output layer is taken as the square root of the product of the number of input-layer nodes and the number of MLP output-layer nodes, and the number of neurons in the fully connected softmax layer equals the number of categories of the corresponding classification scheme.
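Interpreted as the geometric mean of the two node counts (our reading of this rule), the computation is one line; the rounding is our choice:

    import math

    def hidden_units(n_input: int, n_output: int) -> int:
        # geometric mean of input-layer and output-layer node counts
        return round(math.sqrt(n_input * n_output))

    hidden_units(600, 5)  # e.g. 600 BiLSTM features and 5 classes give 55 neurons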
Further, in the training process of the five-layer neural network model, the semantic word vectors are kept unchanged while the part-of-speech word vectors are adjusted using the back-propagation algorithm.
Further, to ensure that the KL distance is as small as possible, the KL distance is added to the loss function and serves as one of the objectives of neural network model optimization.
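A hedged sketch of that composite objective, a cross-entropy term plus a weighted KL term; the weight kl_lambda is an assumed hyperparameter not specified by the invention:

    import torch.nn.functional as F

    def total_loss(logits, labels, sem_attn, pos_attn, kl_lambda=0.1):
        ce = F.cross_entropy(logits, labels)
        # F.kl_div expects log-probabilities as input and probabilities as target
        kl = F.kl_div(pos_attn.log(), sem_attn, reduction="batchmean")
        return ce + kl_lambda * kl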
Compared with the prior art, the invention has the following advantages and beneficial effects:
the sentence classification method based on LSTM and combined with the part of speech and the multi-attention mechanism provided by the invention can fully utilize the advantage that a large-scale corpus can provide more accurate grammar and semantic information, and can introduce the part of speech information of the sentence to further make up the defect that the pre-training word vector lacks the part of speech information, thereby better describing the characteristics of the sentence in the aspects of grammar and semantics. The method also comprehensively utilizes the advantages of the LSTM in the aspect of learning context information of words and parts of speech in the sentence and the advantages of an attention mechanism in the aspect of learning important local features of the sentence, the provided classification model has the advantages of high accuracy, strong universality and the like, and good effects are achieved in some famous public corpora including a 20Newsgroup corpus, an IMDB corpus, a Movie Review, a TREC, a Stanford Sentment Treebank (SSTB) and the like.
Drawings
Fig. 1 is a general structure diagram of a five-layer neural network model in the embodiment of the present invention.
Detailed Description
The present invention is described in further detail below with reference to an example and the accompanying drawing, but the embodiments of the present invention are not limited thereto.
Example:
the embodiment provides a sentence classification method based on LSTM and combined with part of speech and multi-attention mechanism, which mainly adopts the following steps that on one hand, a pre-training word vector is utilized to give semantic word vector representation of words in a sentence, on the other hand, a part of speech tagging tool is utilized to tag the words in the sentence, and in combination with a simplified part of speech tag set (mainly comprising nouns, verbs, adjectives, adverbs, ending tags UNK and the like), the part of speech is converted into a serial number form, and then mapping and learning are carried out through an embedding layer; secondly, respectively learning the context information of the semantic word vector and the part-of-speech word vector by utilizing a shared bidirectional LSTM, and outputting the forward learning result and the reverse learning result of each time step after being connected in series and combined, thereby respectively obtaining the context relationship of the words and the parts-of-speech; on the basis, a self-attention layer is utilized to learn position information in sentences respectively aiming at semantic word vector sequences and part-of-speech word vector sequences output by an LSTM layer, corresponding attention vectors are constructed, and KL distances are utilized to constrain the attention vectors, so that when the attention weight of the semantic word vectors at a certain position is high, the attention weight of the part-of-speech word vectors is also high, and useful semantic and part-of-speech characteristics for sentence classification are captured better; then, a user-defined merging layer is used for taking two attention vectors obtained from the attention layer and the output of the LSTM as input, weighted averaging is carried out respectively, then summing is carried out to obtain the representation of the sentence in the aspects of semantics and part of speech, and results are merged (various different modes such as weighted smoothing, series connection, summing and maximum value solving are adopted respectively) to obtain the final semantic representation of the sentence; finally, a multi-layer perceptron MLP comprising a fully-connected hidden layer and a softmax output layer is used for prediction and classification output. In the learning process of the model, the pre-training word vectors are kept unchanged, and the part-of-speech word vectors are adjusted by using a back propagation algorithm in the model training process.
The method is based on the following five-layer neural network model, whose structure is shown in fig. 1. The first through fifth layers are an input layer, a shared bidirectional LSTM layer, a self-attention layer, a merging layer and an MLP output layer; some key parameters of the model are listed in table 1:
Table 1 (key model parameters; reproduced only as an image, BDA0001882596130000041, in the original publication)
The first layer of the model first preprocesses the sentence, mainly filtering punctuation, expanding abbreviations and removing extra spaces, then determines a sentence-length threshold from the sentence-length distribution and its standard deviation, and pads sentences to that length. Next, on the one hand, a pre-trained word vector table supplies the semantic vector of each word in the sentence; on the other hand, NLTK tags the part of speech of each word, tags of the same kind are merged into a simplified set and converted into index form, and these indices are mapped through an embedding layer to part-of-speech word vectors of a specified dimension, randomly initialized from the uniform distribution over the interval (-0.25, 0.25) and adjusted during model training. For each sentence, the input layer thus yields a corresponding semantic word vector matrix and part-of-speech word vector matrix. During training, the semantic word vectors are kept unchanged and the part-of-speech word vectors are learned.
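The embedding setup just described can be sketched in PyTorch as follows; the vocabulary size and dimensions are illustrative assumptions, and the random matrix merely stands in for a real pre-trained word vector table:

    import torch
    import torch.nn as nn

    VOCAB, POS_TAGS, WORD_DIM, POS_DIM = 20000, 5, 300, 50

    pretrained = torch.randn(VOCAB, WORD_DIM)  # stand-in for a loaded pre-trained table
    word_emb = nn.Embedding.from_pretrained(pretrained, freeze=True)  # kept fixed

    pos_emb = nn.Embedding(POS_TAGS, POS_DIM)      # learned during training
    nn.init.uniform_(pos_emb.weight, -0.25, 0.25)  # interval from the description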
The second layer of the model is a shared bidirectional LSTM network. For the semantic word vector matrix and the part-of-speech word vector matrix produced by the input layer, the bidirectional LSTM learns the context information of the sentence with a forward LSTM and a backward LSTM and outputs the concatenated learning results of each step, finally yielding a sequence of vectors carrying semantic and contextual information and a sequence carrying part-of-speech and contextual information.
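A sketch of this layer under our reading: since the semantic and part-of-speech inputs have different dimensions, the shared bidirectional LSTM is rendered here as one BiLSTM module per channel, with illustrative hidden size and batch shapes (whether weights are shared across the two channels is not spelled out in the text):

    import torch
    import torch.nn as nn

    HIDDEN = 150
    sem_lstm = nn.LSTM(300, HIDDEN, batch_first=True, bidirectional=True)
    pos_lstm = nn.LSTM(50, HIDDEN, batch_first=True, bidirectional=True)

    sem_in = torch.randn(32, 50, 300)  # (batch, seq_len, word_dim) from the input layer
    pos_in = torch.randn(32, 50, 50)   # (batch, seq_len, pos_dim)
    sem_out, _ = sem_lstm(sem_in)      # (32, 50, 2*HIDDEN): forward and backward concatenated per step
    pos_out, _ = pos_lstm(pos_in)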
The third layer of the model is the self-attention layer. Using a self-attention mechanism with a dot-product function, it learns important local features at each position in the sentence from the semantic word vector sequence and the part-of-speech word vector sequence respectively, obtains the corresponding semantic attention vector and part-of-speech attention vector, and constrains the two through the KL distance. To keep the KL distance as small as possible, it is added to the loss function as one of the optimization objectives of the model.
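One way to realize this layer, assuming a single learned query vector per channel for the dot-product scores (the text names only a self-attention mechanism with a dot-product function):

    import torch
    import torch.nn.functional as F

    def self_attention(h, query):
        # h: (batch, seq_len, dim); query: (dim,) learned parameter
        scores = h @ query               # dot-product score at each position
        return F.softmax(scores, dim=1)  # attention distribution over positions

    sem_q = torch.randn(300, requires_grad=True)  # dim = 2*HIDDEN from the BiLSTM sketch
    pos_q = torch.randn(300, requires_grad=True)
    sem_attn = self_attention(sem_out, sem_q)
    pos_attn = self_attention(pos_out, pos_q)
    kl = F.kl_div(pos_attn.log(), sem_attn, reduction="batchmean")  # term added to the loss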
The fourth layer of the model is a custom merging layer. It uses the semantic attention vector and the part-of-speech attention vector obtained from the self-attention layer to compute weighted sums of the LSTM layer's output sequences, obtaining the semantic representation and the part-of-speech representation of the sentence, and then merges the two into the final semantic representation of the sentence. In experiments, several combination modes (weighted averaging, concatenation, summation and element-wise maximum) were compared, and analysis of the results shows that weighted averaging and concatenation work better than plain summation or taking the maximum.
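Continuing the sketch, the weighted sums and the four combination modes compared above look like this:

    import torch

    sem_vec = (sem_attn.unsqueeze(-1) * sem_out).sum(dim=1)  # (batch, 2*HIDDEN)
    pos_vec = (pos_attn.unsqueeze(-1) * pos_out).sum(dim=1)

    merged_avg = 0.5 * (sem_vec + pos_vec)              # weighted average
    merged_cat = torch.cat([sem_vec, pos_vec], dim=-1)  # concatenation
    merged_sum = sem_vec + pos_vec                      # summation
    merged_max = torch.maximum(sem_vec, pos_vec)        # element-wise maximum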
The fifth layer of the model is a fully connected hidden layer followed by a softmax layer for multi-class logistic regression; it predicts and outputs the class of the sentence using multi-class cross-entropy and the RMSprop optimizer based on stochastic gradient descent. Throughout training, the part-of-speech word vectors in the input layer are adjusted by back-propagation, and the loss function including the KL distance term is optimized.
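A final sketch of the output layer and one training step, reusing tensors from the sketches above; the layer sizes, Tanh activation, learning rate and KL weight are illustrative assumptions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    N_CLASSES = 5
    mlp = nn.Sequential(
        nn.Linear(600, 55),        # 55 ~ sqrt(600 * 5), following the sizing rule above
        nn.Tanh(),
        nn.Linear(55, N_CLASSES),  # logits; softmax is folded into the loss
    )

    optimizer = torch.optim.RMSprop(mlp.parameters(), lr=1e-3)
    logits = mlp(merged_cat)                           # merged sentence vectors, shape (32, 600)
    labels = torch.randint(0, N_CLASSES, (32,))        # dummy class labels
    loss = F.cross_entropy(logits, labels) + 0.1 * kl  # multi-class cross-entropy plus KL term
    loss.backward()
    optimizer.step()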
The above describes only preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any substitution or modification that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention, based on the technical solution and inventive concept of the present invention, shall fall within the protection scope of the present invention.

Claims (5)

1. A sentence classification method based on LSTM combining part-of-speech information and a multi-attention mechanism, characterized in that the method is based on a five-layer neural network model whose first through fifth layers are an input layer, a shared bidirectional LSTM layer, a self-attention layer, a merging layer and an MLP output layer, and the method specifically comprises the following steps:
after preprocessing each sentence in the input layer, assigning a numeric representation to each word and to its part of speech using, respectively, a pre-trained word vector table and a matrix initialized randomly from a uniform distribution, thereby converting each sentence into a semantic word vector matrix and a part-of-speech word vector matrix;
learning the context information of the words and of the parts of speech in the sentence through two LSTM layers running in opposite directions within the shared bidirectional LSTM layer, and concatenating and outputting the learning results of each step;
in the self-attention layer, using a self-attention mechanism with a dot-product function to learn important local features at each position in the sentence from the semantic word vector sequence and the part-of-speech word vector sequence respectively, obtaining the corresponding semantic attention vector and part-of-speech attention vector, and constraining the two through the KL (Kullback-Leibler) distance so that their distributions over the positions in the sentence stay as consistent as possible;
in the merging layer, using the semantic attention vector and the part-of-speech attention vector obtained from the self-attention layer to compute weighted sums of the output sequences of the bidirectional LSTM layer, yielding the semantic representation and the part-of-speech representation of the sentence, and then obtaining the final sentence semantic representation by comparing weighted averaging, concatenation, summation and element-wise maximum;
and finally, performing prediction and classification output through an MLP output layer comprising a fully connected hidden layer and a fully connected softmax layer.
2. The sentence classification method based on LSTM combining part-of-speech information and a multi-attention mechanism of claim 1, wherein the preprocessing of the sentence in the input layer comprises word segmentation, illegal-character filtering and length-completion operations on the sentence.
3. The sentence classification method based on LSTM combining part-of-speech information and a multi-attention mechanism of claim 1, wherein the number of neurons in the fully connected hidden layer of the MLP output layer is taken as the square root of the product of the number of input-layer nodes and the number of MLP output-layer nodes, and the number of neurons in the fully connected softmax layer equals the number of categories of the corresponding classification scheme.
4. The sentence classification method based on LSTM combining part-of-speech information and a multi-attention mechanism of claim 1, wherein, in the training process of the five-layer neural network model, the semantic word vectors are kept unchanged and the part-of-speech word vectors are adjusted using the back-propagation algorithm.
5. The sentence classification method based on LSTM combining part-of-speech information and a multi-attention mechanism of claim 1, wherein, to ensure that the KL distance is as small as possible, the KL distance is added to the loss function and serves as one of the objectives of neural network model optimization.
CN201811430542.7A 2018-11-28 2018-11-28 Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism Active CN109635109B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811430542.7A CN109635109B (en) 2018-11-28 2018-11-28 Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811430542.7A CN109635109B (en) 2018-11-28 2018-11-28 Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism

Publications (2)

Publication Number Publication Date
CN109635109A CN109635109A (en) 2019-04-16
CN109635109B true CN109635109B (en) 2022-12-16

Family

ID=66069692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811430542.7A Active CN109635109B (en) 2018-11-28 2018-11-28 Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism

Country Status (1)

Country Link
CN (1) CN109635109B (en)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532378B (en) * 2019-05-13 2021-10-26 南京大学 Short text aspect extraction method based on topic model
CN110147452B (en) * 2019-05-17 2022-03-01 北京理工大学 Coarse grain emotion analysis method based on hierarchy BERT neural network
CN110347831A (en) * 2019-06-28 2019-10-18 西安理工大学 Based on the sensibility classification method from attention mechanism
CN110457682B (en) * 2019-07-11 2022-08-09 新华三大数据技术有限公司 Part-of-speech tagging method for electronic medical record, model training method and related device
CN110569499B (en) * 2019-07-18 2021-10-08 中国科学院信息工程研究所 Generating type dialog system coding method and coder based on multi-mode word vectors
CN110427627B (en) * 2019-08-02 2023-04-28 北京百度网讯科技有限公司 Task processing method and device based on semantic representation model
CN110795563A (en) * 2019-10-31 2020-02-14 支付宝(杭州)信息技术有限公司 Text classification model training method, event detection method and corresponding devices
CN110781306B (en) * 2019-10-31 2022-06-28 山东师范大学 English text aspect layer emotion classification method and system
CN110941700B (en) * 2019-11-22 2022-08-09 福州大学 Multi-task joint learning-based argument mining system and working method thereof
CN110929033A (en) * 2019-11-26 2020-03-27 深圳市信联征信有限公司 Long text classification method and device, computer equipment and storage medium
CN111339772B (en) * 2020-03-16 2023-11-14 大连外国语大学 Russian text emotion analysis method, electronic device and storage medium
CN111709230B (en) * 2020-04-30 2023-04-07 昆明理工大学 Short text automatic summarization method based on part-of-speech soft template attention mechanism
CN111581351B (en) * 2020-04-30 2023-05-02 识因智能科技(北京)有限公司 Dynamic element embedding method based on multi-head self-attention mechanism
CN111914085B (en) * 2020-06-18 2024-04-23 华南理工大学 Text fine granularity emotion classification method, system, device and storage medium
CN111737467B (en) * 2020-06-22 2023-05-23 华南师范大学 Object-level emotion classification method based on segmented convolutional neural network
US20220019741A1 (en) * 2020-07-16 2022-01-20 Optum Technology, Inc. An unsupervised approach to assignment of pre-defined labels to text documents
CN112084336A (en) * 2020-09-09 2020-12-15 浙江综合交通大数据中心有限公司 Entity extraction and event classification method and device for expressway emergency
CN112163429B (en) * 2020-09-27 2023-08-29 华南理工大学 Sentence correlation obtaining method, system and medium combining cyclic network and BERT
CN112287689B (en) * 2020-10-27 2022-06-24 山东省计算中心(国家超级计算济南中心) Judicial second-examination case situation auxiliary analysis method and system
CN112487796B (en) * 2020-11-27 2022-02-18 北京智谱华章科技有限公司 Method and device for sequence labeling and electronic equipment
CN112417890B (en) * 2020-11-29 2023-11-24 中国科学院电子学研究所苏州研究院 Fine granularity entity classification method based on diversified semantic attention model
CN112651225B (en) * 2020-12-29 2022-06-14 昆明理工大学 Multi-item selection machine reading understanding method based on multi-stage maximum attention
CN113268565B (en) * 2021-04-27 2022-03-25 山东大学 Method and device for quickly generating word vector based on concept text
CN113535948B (en) * 2021-06-02 2022-08-16 中国人民解放军海军工程大学 LSTM-Attention text classification method introducing essential point information
US11941357B2 (en) 2021-06-23 2024-03-26 Optum Technology, Inc. Machine learning techniques for word-based text similarity determinations
CN114547287B (en) * 2021-11-18 2023-04-07 电子科技大学 Generation type text abstract method
CN114048319B (en) * 2021-11-29 2024-04-23 中国平安人寿保险股份有限公司 Humor text classification method, device, equipment and medium based on attention mechanism
CN114579707B (en) * 2022-03-07 2023-07-28 桂林旅游学院 Aspect-level emotion analysis method based on BERT neural network and multi-semantic learning
CN114492420B (en) * 2022-04-02 2022-07-29 北京中科闻歌科技股份有限公司 Text classification method, device and equipment and computer readable storage medium
CN115906863B (en) * 2022-10-25 2023-09-12 华南师范大学 Emotion analysis method, device, equipment and storage medium based on contrast learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590138A * 2017-08-18 2018-01-16 浙江大学 Neural machine translation method based on a part-of-speech attention mechanism
CN108446275A * 2018-03-21 2018-08-24 北京理工大学 Long-text sentiment orientation analysis method based on a two-layer LSTM with attention
CN108549658A * 2018-03-12 2018-09-18 浙江大学 Deep learning video question answering method and system based on an attention mechanism over syntactic analysis trees

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11222253B2 (en) * 2016-11-03 2022-01-11 Salesforce.Com, Inc. Deep neural network model for processing data through multiple linguistic task hierarchies
US11205103B2 (en) * 2016-12-09 2021-12-21 The Research Foundation for the State University Semisupervised autoencoder for sentiment analysis
US10733380B2 (en) * 2017-05-15 2020-08-04 Thomson Reuters Enterprise Center Gmbh Neural paraphrase generator

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590138A * 2017-08-18 2018-01-16 浙江大学 Neural machine translation method based on a part-of-speech attention mechanism
CN108549658A * 2018-03-12 2018-09-18 浙江大学 Deep learning video question answering method and system based on an attention mechanism over syntactic analysis trees
CN108446275A * 2018-03-21 2018-08-24 北京理工大学 Long-text sentiment orientation analysis method based on a two-layer LSTM with attention

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Structured Self-attentive Sentence Embedding; Zhouhan Lin et al.; arXiv; 2017-05-09; pp. 1-15 *

Also Published As

Publication number Publication date
CN109635109A (en) 2019-04-16

Similar Documents

Publication Publication Date Title
CN109635109B (en) Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism
CN108984745B (en) Neural network text classification method fusing multiple knowledge maps
CN107133213B (en) Method and system for automatically extracting text abstract based on algorithm
CN110765775B (en) Self-adaptive method for named entity recognition field fusing semantics and label differences
CN113128229B (en) Chinese entity relation joint extraction method
CN111783462A (en) Chinese named entity recognition model and method based on dual neural network fusion
CN110263325B (en) Chinese word segmentation system
CN111401061A Method for identifying news opinion involved in case based on BERT and BiLSTM-Attention
CN113312452B (en) Chapter-level text continuity classification method based on multi-task learning
CN113392209B (en) Text clustering method based on artificial intelligence, related equipment and storage medium
CN109299211B (en) Automatic text generation method based on Char-RNN model
CN113673254B (en) Knowledge distillation position detection method based on similarity maintenance
CN112163089B (en) High-technology text classification method and system integrating named entity recognition
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN110874411A (en) Cross-domain emotion classification system based on attention mechanism fusion
CN114881042B (en) Chinese emotion analysis method based on graph-convolution network fusion of syntactic dependency and part of speech
CN112232053A (en) Text similarity calculation system, method and storage medium based on multi-keyword pair matching
Chen et al. Deep neural networks for multi-class sentiment classification
CN114462420A (en) False news detection method based on feature fusion model
CN111914553A (en) Financial information negative subject judgment method based on machine learning
Han et al. An attention-based neural framework for uncertainty identification on social media texts
Verma et al. Semantic similarity between short paragraphs using Deep Learning
Liu et al. Research on advertising content recognition based on convolutional neural network and recurrent neural network
CN115169349A (en) Chinese electronic resume named entity recognition method based on ALBERT
SiChen A neural network based text classification with attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant