CN109635109B - Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism - Google Patents
Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism
- Publication number
- CN109635109B CN201811430542.7A CN201811430542A
- Authority
- CN
- China
- Prior art keywords
- speech
- layer
- attention
- sentence
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a sentence classification method based on LSTM combined with part of speech and a multi-attention mechanism, which comprises the following steps: in the input layer, converting each sentence into two continuous, dense matrices, a semantic word vector matrix and a part-of-speech word vector matrix; in a shared bidirectional LSTM layer, learning the context information of the words or parts of speech in the sentence, and outputting the concatenated learning results of each step; in the self-attention layer, adopting a self-attention mechanism and a dot-product function to learn important local features at each position in the sentence from the semantic word vector sequence and the part-of-speech word vector sequence respectively, obtaining the corresponding semantic attention vector and part-of-speech attention vector, and constraining the two vectors through the KL (Kullback-Leibler) distance; in the merging layer, performing weighted summation over the output sequence of the bidirectional LSTM layer using the obtained semantic attention vector and part-of-speech attention vector to obtain the semantic representation and part-of-speech representation of the sentence, and from these the final semantic representation of the sentence; and finally, performing prediction and classification output through an MLP output layer.
Description
Technical Field
The invention relates to the field of natural language processing, and in particular to a sentence classification method based on LSTM combined with part-of-speech information and a multi-attention mechanism.
Background
Sentence classification has long been a research hotspot in the field of Natural Language Processing (NLP). In recent years, with the wide application of deep learning in NLP, many scholars have successively proposed various sentence classification methods based on the Long Short-Term Memory (LSTM) model, achieving better results than traditional machine learning methods on many sentence classification corpora such as Stanford Twitter Sentiment (STS), the Stanford Sentiment Treebank in its binary (SSTb2) and five-class (SSTb5) versions, TREC, and IMDB. Compared with the Convolutional Neural Network (CNN), the LSTM can better describe the context information and long-term dependencies of text sequence data, and effectively avoids the gradient vanishing or gradient explosion problems of the traditional Recurrent Neural Network (RNN) model, so the LSTM is widely applied to sentence classification tasks.
Currently, various LSTM-based sentence classification models mainly use word vectors trained on large-scale corpora to convert the words in a sentence into distributed representations. Existing research has shown that word vectors trained on large-scale corpora contain comprehensive grammatical and semantic information and can greatly improve sentence classification performance. Commonly used pre-trained word vectors are mainly obtained with the CBOW or Skip-gram models of word2vec, the GloVe algorithm, or the FastText algorithm. When training word vectors, these models or algorithms rely mainly on word co-occurrence information within a window (or globally) and do not encode the part-of-speech information of the words themselves. Therefore, the trained word vectors only contain content-level information and do not reflect the parts of speech of words. In a general text classification task (such as news text classification), feature words, which are mainly nouns or verbs, have an important indicative effect on the classification result; for example, "a typhoon will enter the southeast coastal area of China" or "China will continue tax relief for small and medium-sized enterprises". In text sentiment classification tasks, the opinion words or sentiment words indicating positive or negative emotional tendency, mainly verbs or adjectives, are more important; for example, "I like this movie" or "this movie looks really good". Related studies have also shown that adjectives are the main carriers of opinion and emotion. Therefore, introducing part-of-speech information can enrich the feature representation of a sentence and thereby help improve sentence classification. In recent years, some scholars have introduced the attention mechanism from the image domain into NLP and achieved a series of state-of-the-art results on many subtasks, such as machine translation, text summarization, relation extraction, reading comprehension, and textual entailment. The attention mechanism enables a model to better weigh the different influences of the elements in the input source on the target result, and reduces the loss of detail information caused by long sentences. Some researchers have further proposed the Self-attention mechanism, also called Intra-attention, whose main idea is to use the position information of each element in a sentence to compute a corresponding attention vector and characterize the sentence. Currently, the combination of LSTM with attention (or self-attention) mechanisms has become the core of many models. However, these studies mainly address attention at the content level and do not consider the part-of-speech information of the words.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the invention provides a sentence classification method based on LSTM combined with part of speech and a multi-attention mechanism. The method can fully exploit the advantage that a large-scale corpus provides accurate grammatical and semantic information, while also introducing the part-of-speech information of the sentence to make up for the lack of part-of-speech information in pre-trained word vectors, thereby better describing the grammatical and semantic characteristics of the sentence.
The purpose of the invention can be realized by the following technical scheme:
a sentence classification method based on LSTM and combined with part of speech and multi-attention mechanism is based on the following five-layer neural network model, the first layer to the fifth layer are respectively an input layer, a shared bidirectional LSTM layer, a self-attention layer, a merging layer and an MLP output layer, and the method specifically comprises the following steps:
after preprocessing the sentences in the input layer, giving mathematical representations of each word in a sentence and of its part of speech by using a pre-trained word vector table and a matrix generated by uniformly distributed random initialization respectively, thereby converting each sentence into a semantic word vector matrix and a part-of-speech word vector matrix;
in the shared bidirectional LSTM layer, learning the context information of the words or parts of speech in the sentence through two LSTM layers running in opposite directions, and outputting the concatenated learning results of each step;
in the self-attention layer, adopting a self-attention mechanism and a dot-product function to learn important local features at each position in the sentence from the semantic word vector sequence and the part-of-speech word vector sequence respectively, obtaining the corresponding semantic attention vector and part-of-speech attention vector, and constraining the two vectors through the KL (Kullback-Leibler) distance so that their distributions over the positions in the sentence remain as consistent as possible;
in the merging layer, performing weighted summation over the output sequence of the bidirectional LSTM layer using the semantic attention vector and part-of-speech attention vector obtained from the self-attention layer to obtain the semantic representation and part-of-speech representation of the sentence, and then obtaining the final sentence semantic representation by comparing weighted averaging, concatenation, summation, and max operations;
and finally, performing prediction and classification output through an MLP output layer comprising a fully-connected hidden layer and a fully-connected softmax layer.
Further, the preprocessing of the sentences in the input layer includes word segmentation, illegal-character filtering, and length-completion operations on the sentences.
Furthermore, the number of neurons of the fully-connected hidden layer in the MLP output layer is derived from the square root of the product of the number of input-layer nodes and the number of MLP output-layer nodes, and the number of neurons of the fully-connected softmax layer is the number of categories of the corresponding classification system.
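For example, under this rule and with purely hypothetical sizes, a 600-node input feeding a 6-category output layer would give a hidden layer of about √(600 × 6) = 60 neurons.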
Further, during the training of the five-layer neural network model, the semantic word vectors are kept unchanged, while the part-of-speech word vectors are adjusted using the backpropagation algorithm.
Further, to ensure that the KL distance is as small as possible, the KL distance is added to the loss function and serves as one of the objectives of neural network model optimization.
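Written out, with L_CE denoting the classification cross-entropy loss and λ a weighting coefficient (both symbols introduced here only for illustration, since the patent does not name them), the combined objective takes the form:

L = L_CE + λ · KL(a_sem ‖ a_pos)

where a_sem and a_pos denote the semantic attention vector and the part-of-speech attention vector.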
Compared with the prior art, the invention has the following advantages and beneficial effects:
the sentence classification method based on LSTM and combined with the part of speech and the multi-attention mechanism provided by the invention can fully utilize the advantage that a large-scale corpus can provide more accurate grammar and semantic information, and can introduce the part of speech information of the sentence to further make up the defect that the pre-training word vector lacks the part of speech information, thereby better describing the characteristics of the sentence in the aspects of grammar and semantics. The method also comprehensively utilizes the advantages of the LSTM in the aspect of learning context information of words and parts of speech in the sentence and the advantages of an attention mechanism in the aspect of learning important local features of the sentence, the provided classification model has the advantages of high accuracy, strong universality and the like, and good effects are achieved in some famous public corpora including a 20Newsgroup corpus, an IMDB corpus, a Movie Review, a TREC, a Stanford Sentment Treebank (SSTB) and the like.
Drawings
Fig. 1 is a general structure diagram of a five-layer neural network model in the embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Embodiment:
the embodiment provides a sentence classification method based on LSTM and combined with part of speech and multi-attention mechanism, which mainly adopts the following steps that on one hand, a pre-training word vector is utilized to give semantic word vector representation of words in a sentence, on the other hand, a part of speech tagging tool is utilized to tag the words in the sentence, and in combination with a simplified part of speech tag set (mainly comprising nouns, verbs, adjectives, adverbs, ending tags UNK and the like), the part of speech is converted into a serial number form, and then mapping and learning are carried out through an embedding layer; secondly, respectively learning the context information of the semantic word vector and the part-of-speech word vector by utilizing a shared bidirectional LSTM, and outputting the forward learning result and the reverse learning result of each time step after being connected in series and combined, thereby respectively obtaining the context relationship of the words and the parts-of-speech; on the basis, a self-attention layer is utilized to learn position information in sentences respectively aiming at semantic word vector sequences and part-of-speech word vector sequences output by an LSTM layer, corresponding attention vectors are constructed, and KL distances are utilized to constrain the attention vectors, so that when the attention weight of the semantic word vectors at a certain position is high, the attention weight of the part-of-speech word vectors is also high, and useful semantic and part-of-speech characteristics for sentence classification are captured better; then, a user-defined merging layer is used for taking two attention vectors obtained from the attention layer and the output of the LSTM as input, weighted averaging is carried out respectively, then summing is carried out to obtain the representation of the sentence in the aspects of semantics and part of speech, and results are merged (various different modes such as weighted smoothing, series connection, summing and maximum value solving are adopted respectively) to obtain the final semantic representation of the sentence; finally, a multi-layer perceptron MLP comprising a fully-connected hidden layer and a softmax output layer is used for prediction and classification output. In the learning process of the model, the pre-training word vectors are kept unchanged, and the part-of-speech word vectors are adjusted by using a back propagation algorithm in the model training process.
The method is based on the following five-layer neural network model, whose structure is shown in Fig. 1. The first to fifth layers are an input layer, a shared bidirectional LSTM layer, a self-attention layer, a merging layer, and an MLP output layer respectively; some key parameters of the model are shown in Table 1:
TABLE 1
The first layer of the model first preprocesses the sentences, mainly including punctuation filtering, abbreviation expansion, and extra-space removal, then determines a sentence-length threshold from the length distribution and its mean square error, and pads sentences to that length. Next, on the one hand, a pre-trained word vector table is used to represent the semantic vector of each word in the sentence; on the other hand, NLTK is used to tag the part of speech of each word, tags of the same type are merged and simplified, and the parts of speech are converted into serial-number form. The part-of-speech vectors are then randomly initialized as word vectors of a specified dimension using a uniform distribution over the interval (-0.25, 0.25), and are learned and adjusted through an embedding layer during model training. For each sentence, the input layer finally yields a corresponding semantic word vector matrix and part-of-speech word vector matrix. During model training, the semantic word vectors are kept unchanged while the part-of-speech word vectors are learned.
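The following is a minimal sketch of this input-layer processing in Python with NLTK and NumPy. The simplified tag mapping, the 300/50 vector dimensions, the padding length, and the helper names are illustrative assumptions rather than values fixed by the patent:

```python
import numpy as np
import nltk  # assumes the averaged-perceptron tagger data has been downloaded

# Simplified part-of-speech tag set: nouns, verbs, adjectives, adverbs, UNK.
POS2ID = {"NOUN": 0, "VERB": 1, "ADJ": 2, "ADV": 3, "UNK": 4}

def coarse_tag(penn_tag):
    # Merge Penn Treebank tags of the same type into the simplified set.
    if penn_tag.startswith("NN"): return POS2ID["NOUN"]
    if penn_tag.startswith("VB"): return POS2ID["VERB"]
    if penn_tag.startswith("JJ"): return POS2ID["ADJ"]
    if penn_tag.startswith("RB"): return POS2ID["ADV"]
    return POS2ID["UNK"]

def encode_sentence(tokens, word_vectors, max_len=40, sem_dim=300):
    tokens = tokens[:max_len]
    # Semantic matrix from the pre-trained word vector table (kept frozen).
    sem = np.stack([word_vectors.get(w, np.zeros(sem_dim)) for w in tokens])
    # Part-of-speech serial numbers via NLTK tagging.
    pos_ids = [coarse_tag(tag) for _, tag in nltk.pos_tag(tokens)]
    # Length padding up to the sentence-length threshold.
    pad = max_len - len(tokens)
    sem = np.pad(sem, ((0, pad), (0, 0)))
    pos_ids += [POS2ID["UNK"]] * pad
    return sem, np.array(pos_ids)

# The part-of-speech embedding table is initialized uniformly in (-0.25, 0.25)
# and, unlike the semantic vectors, is adjusted during training.
pos_embedding = np.random.uniform(-0.25, 0.25, size=(len(POS2ID), 50))
```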
The second layer of the model comprises a shared bidirectional LSTM network. For the semantic word vector matrix and the part-of-speech word vector matrix obtained from the input layer, each bidirectional LSTM learns the context information of the sentence using one forward LSTM and one backward LSTM, and the learning results of each step are concatenated and output, finally yielding a vector containing semantic and context information and a vector containing part-of-speech and context information respectively.
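A minimal PyTorch sketch of this layer follows. The hidden size is a hypothetical value, and since the text does not fully specify whether the two streams literally share one set of LSTM weights, they are kept separate here:

```python
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    # One bidirectional LSTM per stream; at each time step the forward and
    # backward outputs are concatenated, as described for the second layer.
    def __init__(self, sem_dim=300, pos_dim=50, hidden=150):
        super().__init__()
        self.sem_lstm = nn.LSTM(sem_dim, hidden, bidirectional=True, batch_first=True)
        self.pos_lstm = nn.LSTM(pos_dim, hidden, bidirectional=True, batch_first=True)

    def forward(self, sem, pos):
        # sem: (batch, len, sem_dim); pos: (batch, len, pos_dim)
        sem_ctx, _ = self.sem_lstm(sem)  # (batch, len, 2 * hidden)
        pos_ctx, _ = self.pos_lstm(pos)  # (batch, len, 2 * hidden)
        return sem_ctx, pos_ctx
```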
The third layer of the model comprises a self-attention layer, which adopts a self-attention mechanism and a dot-product function to learn important local features at each position in the sentence from the semantic word vector sequence and the part-of-speech word vector sequence respectively, obtaining the corresponding semantic attention vector and part-of-speech attention vector, which are constrained through the KL distance. To keep the KL distance as small as possible, it is added to the loss function and serves as one of the objectives of model optimization.
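A sketch of the dot-product self-attention and the KL constraint is given below; the learned query vector is an assumption, since the patent only states that a self-attention mechanism with a dot-product function is used:

```python
import torch
import torch.nn.functional as F

def dot_product_attention(ctx, query):
    # ctx: (batch, len, d) BiLSTM outputs; query: (d,) learned parameter.
    # Dot-product scoring followed by softmax yields one weight per position.
    scores = ctx @ query              # (batch, len)
    return F.softmax(scores, dim=-1)  # attention vector over positions

def kl_distance(att_sem, att_pos, eps=1e-8):
    # KL (Kullback-Leibler) distance between the two attention distributions;
    # adding it to the loss keeps them as consistent as possible.
    return (att_sem * ((att_sem + eps) / (att_pos + eps)).log()).sum(-1).mean()
```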
The fourth layer of the model comprises a custom merging layer, which mainly uses the semantic attention vector and part-of-speech attention vector obtained from the self-attention layer to perform weighted summation over the output sequence of the LSTM layer, obtaining the semantic representation and part-of-speech representation of the sentence, and then merges them to obtain the final semantic representation of the sentence. In experiments, several merging modes, namely weighted averaging, concatenation, summation, and taking the maximum, were compared, and analysis of the results shows that weighted averaging and concatenation perform better than simple summation or taking the maximum.
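The merging layer can be sketched as follows, with the four compared modes selectable by a flag; the function name and signature are illustrative:

```python
import torch

def merge(sem_ctx, pos_ctx, att_sem, att_pos, mode="concat"):
    # Weighted summation of the BiLSTM outputs by the attention vectors
    # gives the semantic and part-of-speech representations of the sentence.
    r_sem = (att_sem.unsqueeze(-1) * sem_ctx).sum(dim=1)  # (batch, 2 * hidden)
    r_pos = (att_pos.unsqueeze(-1) * pos_ctx).sum(dim=1)
    if mode == "mean":    # weighted average of the two representations
        return 0.5 * (r_sem + r_pos)
    if mode == "concat":  # concatenation, reported to work well
        return torch.cat([r_sem, r_pos], dim=-1)
    if mode == "sum":
        return r_sem + r_pos
    if mode == "max":     # element-wise maximum
        return torch.max(r_sem, r_pos)
    raise ValueError(mode)
```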
The fifth layer of the model is a fully-connected hidden layer plus a softmax layer for multi-class logistic regression, which predicts and outputs the category of a sentence using categorical cross-entropy and the RMSProp optimizer based on stochastic gradient descent. Throughout model training, the part-of-speech word vectors in the input layer are adjusted via backpropagation, and the loss function together with the KL distance is optimized.
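A sketch of the output layer and the joint objective, assuming hypothetical layer sizes and a weighting coefficient lambda_kl that the patent does not specify:

```python
import torch
import torch.nn as nn

class MLPOutput(nn.Module):
    # Fully-connected hidden layer followed by a layer producing class logits;
    # the softmax is folded into the cross-entropy loss below.
    def __init__(self, in_dim=600, hidden=60, n_classes=5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, n_classes))

    def forward(self, x):
        return self.net(x)

head = MLPOutput()
optimizer = torch.optim.RMSprop(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()  # categorical cross-entropy
lambda_kl = 0.1  # hypothetical weight for the KL term

# One training step, given a merged sentence representation and the two
# attention vectors from the previous layers:
# loss = criterion(head(sentence_repr), labels) \
#        + lambda_kl * kl_distance(att_sem, att_pos)
# loss.backward(); optimizer.step()
```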
The above describes only the preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any substitution or modification of the technical solution and inventive concept of the present invention made by a person skilled in the art within the scope disclosed by the present invention, together with its equivalents, falls within the protection scope of the present invention.
Claims (5)
1. A sentence classification method based on LSTM combined with part of speech and a multi-attention mechanism, characterized in that the method is based on a five-layer neural network model, whose first to fifth layers are an input layer, a shared bidirectional LSTM layer, a self-attention layer, a merging layer, and an MLP output layer respectively, and that the method specifically comprises the following steps:
after preprocessing the sentences in the input layer, giving mathematical representations of each word in a sentence and of its part of speech by using a pre-trained word vector table and a matrix generated by uniformly distributed random initialization respectively, thereby converting each sentence into a semantic word vector matrix and a part-of-speech word vector matrix;
in the shared bidirectional LSTM layer, learning the context information of the words or parts of speech in the sentence through two LSTM layers running in opposite directions, and outputting the concatenated learning results of each step;
in the self-attention layer, adopting a self-attention mechanism and a dot-product function to learn important local features at each position in the sentence from the semantic word vector sequence and the part-of-speech word vector sequence respectively, obtaining the corresponding semantic attention vector and part-of-speech attention vector, and constraining the two vectors through the KL (Kullback-Leibler) distance so that their distributions over the positions in the sentence remain as consistent as possible;
in the merging layer, performing weighted summation over the output sequence of the bidirectional LSTM layer using the semantic attention vector and part-of-speech attention vector obtained from the self-attention layer to obtain the semantic representation and part-of-speech representation of the sentence, and then obtaining the final sentence semantic representation by comparing weighted averaging, concatenation, summation, and max operations;
and finally, performing prediction and classification output through an MLP output layer comprising a fully-connected hidden layer and a fully-connected softmax layer.
2. The sentence classification method based on LSTM combined with part of speech and a multi-attention mechanism of claim 1, characterized in that: the preprocessing of the sentences in the input layer comprises word segmentation, illegal-character filtering, and length-completion operations on the sentences.
3. The sentence classification method based on LSTM combined with part of speech and a multi-attention mechanism of claim 1, characterized in that: the number of neurons of the fully-connected hidden layer in the MLP output layer is derived from the square root of the product of the number of input-layer nodes and the number of MLP output-layer nodes, and the number of neurons of the fully-connected softmax layer is the number of categories of the corresponding classification system.
4. The sentence classification method based on LSTM combined with part of speech and a multi-attention mechanism of claim 1, characterized in that: during the training of the five-layer neural network model, the semantic word vectors are kept unchanged, while the part-of-speech word vectors are adjusted using the backpropagation algorithm.
5. The sentence classification method based on LSTM combined with part of speech and a multi-attention mechanism of claim 1, characterized in that: in order to keep the KL distance as small as possible, the KL distance is added to the loss function and serves as one of the objectives of neural network model optimization.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811430542.7A CN109635109B (en) | 2018-11-28 | 2018-11-28 | Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811430542.7A CN109635109B (en) | 2018-11-28 | 2018-11-28 | Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109635109A CN109635109A (en) | 2019-04-16 |
CN109635109B true CN109635109B (en) | 2022-12-16 |
Family
ID=66069692
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811430542.7A Active CN109635109B (en) | 2018-11-28 | 2018-11-28 | Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109635109B (en) |
Families Citing this family (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532378B (en) * | 2019-05-13 | 2021-10-26 | 南京大学 | Short text aspect extraction method based on topic model |
CN110147452B (en) * | 2019-05-17 | 2022-03-01 | 北京理工大学 | Coarse grain emotion analysis method based on hierarchy BERT neural network |
CN110347831A (en) * | 2019-06-28 | 2019-10-18 | 西安理工大学 | Based on the sensibility classification method from attention mechanism |
CN110457682B (en) * | 2019-07-11 | 2022-08-09 | 新华三大数据技术有限公司 | Part-of-speech tagging method for electronic medical record, model training method and related device |
CN110569499B (en) * | 2019-07-18 | 2021-10-08 | 中国科学院信息工程研究所 | Generating type dialog system coding method and coder based on multi-mode word vectors |
CN110427627B (en) * | 2019-08-02 | 2023-04-28 | 北京百度网讯科技有限公司 | Task processing method and device based on semantic representation model |
CN110795563A (en) * | 2019-10-31 | 2020-02-14 | 支付宝(杭州)信息技术有限公司 | Text classification model training method, event detection method and corresponding devices |
CN110781306B (en) * | 2019-10-31 | 2022-06-28 | 山东师范大学 | English text aspect layer emotion classification method and system |
CN110941700B (en) * | 2019-11-22 | 2022-08-09 | 福州大学 | Multi-task joint learning-based argument mining system and working method thereof |
CN110929033A (en) * | 2019-11-26 | 2020-03-27 | 深圳市信联征信有限公司 | Long text classification method and device, computer equipment and storage medium |
CN111339772B (en) * | 2020-03-16 | 2023-11-14 | 大连外国语大学 | Russian text emotion analysis method, electronic device and storage medium |
CN111709230B (en) * | 2020-04-30 | 2023-04-07 | 昆明理工大学 | Short text automatic summarization method based on part-of-speech soft template attention mechanism |
CN111581351B (en) * | 2020-04-30 | 2023-05-02 | 识因智能科技(北京)有限公司 | Dynamic element embedding method based on multi-head self-attention mechanism |
CN111914085B (en) * | 2020-06-18 | 2024-04-23 | 华南理工大学 | Text fine granularity emotion classification method, system, device and storage medium |
CN111737467B (en) * | 2020-06-22 | 2023-05-23 | 华南师范大学 | Object-level emotion classification method based on segmented convolutional neural network |
US20220019741A1 (en) * | 2020-07-16 | 2022-01-20 | Optum Technology, Inc. | An unsupervised approach to assignment of pre-defined labels to text documents |
CN112084336A (en) * | 2020-09-09 | 2020-12-15 | 浙江综合交通大数据中心有限公司 | Entity extraction and event classification method and device for expressway emergency |
CN112163429B (en) * | 2020-09-27 | 2023-08-29 | 华南理工大学 | Sentence correlation obtaining method, system and medium combining cyclic network and BERT |
CN112287689B (en) * | 2020-10-27 | 2022-06-24 | 山东省计算中心(国家超级计算济南中心) | Judicial second-examination case situation auxiliary analysis method and system |
CN112487796B (en) * | 2020-11-27 | 2022-02-18 | 北京智谱华章科技有限公司 | Method and device for sequence labeling and electronic equipment |
CN112417890B (en) * | 2020-11-29 | 2023-11-24 | 中国科学院电子学研究所苏州研究院 | Fine granularity entity classification method based on diversified semantic attention model |
CN112651225B (en) * | 2020-12-29 | 2022-06-14 | 昆明理工大学 | Multi-item selection machine reading understanding method based on multi-stage maximum attention |
CN113268565B (en) * | 2021-04-27 | 2022-03-25 | 山东大学 | Method and device for quickly generating word vector based on concept text |
CN113535948B (en) * | 2021-06-02 | 2022-08-16 | 中国人民解放军海军工程大学 | LSTM-Attention text classification method introducing essential point information |
US11941357B2 (en) | 2021-06-23 | 2024-03-26 | Optum Technology, Inc. | Machine learning techniques for word-based text similarity determinations |
CN114547287B (en) * | 2021-11-18 | 2023-04-07 | 电子科技大学 | Generation type text abstract method |
CN114048319B (en) * | 2021-11-29 | 2024-04-23 | 中国平安人寿保险股份有限公司 | Humor text classification method, device, equipment and medium based on attention mechanism |
CN114579707B (en) * | 2022-03-07 | 2023-07-28 | 桂林旅游学院 | Aspect-level emotion analysis method based on BERT neural network and multi-semantic learning |
CN114492420B (en) * | 2022-04-02 | 2022-07-29 | 北京中科闻歌科技股份有限公司 | Text classification method, device and equipment and computer readable storage medium |
CN115906863B (en) * | 2022-10-25 | 2023-09-12 | 华南师范大学 | Emotion analysis method, device, equipment and storage medium based on contrast learning |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107590138A (en) * | 2017-08-18 | 2018-01-16 | 浙江大学 | A neural machine translation method based on a part-of-speech attention mechanism |
CN108446275A (en) * | 2018-03-21 | 2018-08-24 | 北京理工大学 | Long-text emotional orientation analysis method based on attention double-layer LSTM |
CN108549658A (en) * | 2018-03-12 | 2018-09-18 | 浙江大学 | A deep learning video question answering method and system based on an attention mechanism over the syntactic analysis tree |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11222253B2 (en) * | 2016-11-03 | 2022-01-11 | Salesforce.Com, Inc. | Deep neural network model for processing data through multiple linguistic task hierarchies |
US11205103B2 (en) * | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
US10733380B2 (en) * | 2017-05-15 | 2020-08-04 | Thomson Reuters Enterprise Center Gmbh | Neural paraphrase generator |
2018
- 2018-11-28: CN application CN201811430542.7A filed; granted as patent CN109635109B (en), legal status Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107590138A (en) * | 2017-08-18 | 2018-01-16 | 浙江大学 | A neural machine translation method based on a part-of-speech attention mechanism |
CN108549658A (en) * | 2018-03-12 | 2018-09-18 | 浙江大学 | A deep learning video question answering method and system based on an attention mechanism over the syntactic analysis tree |
CN108446275A (en) * | 2018-03-21 | 2018-08-24 | 北京理工大学 | Long-text emotional orientation analysis method based on attention double-layer LSTM |
Non-Patent Citations (1)
Title |
---|
A Structured Self-Attentive Sentence Embedding; Zhouhan Lin et al.; arXiv; 2017-05-09; pp. 1-15 *
Also Published As
Publication number | Publication date |
---|---|
CN109635109A (en) | 2019-04-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109635109B (en) | Sentence classification method based on LSTM and combined with part-of-speech and multi-attention mechanism | |
CN108984745B (en) | Neural network text classification method fusing multiple knowledge maps | |
CN107133213B (en) | Method and system for automatically extracting text abstract based on algorithm | |
CN110765775B (en) | Self-adaptive method for named entity recognition field fusing semantics and label differences | |
CN113128229B (en) | Chinese entity relation joint extraction method | |
CN111783462A (en) | Chinese named entity recognition model and method based on dual neural network fusion | |
CN110263325B (en) | Chinese word segmentation system | |
CN111401061A (en) | Method for identifying news opinion involved in case based on BERT and BiLSTM-Attention | |
CN113312452B (en) | Chapter-level text continuity classification method based on multi-task learning | |
CN113392209B (en) | Text clustering method based on artificial intelligence, related equipment and storage medium | |
CN109299211B (en) | Automatic text generation method based on Char-RNN model | |
CN113673254B (en) | Knowledge distillation position detection method based on similarity maintenance | |
CN112163089B (en) | High-technology text classification method and system integrating named entity recognition | |
CN113255320A (en) | Entity relation extraction method and device based on syntax tree and graph attention machine mechanism | |
CN110874411A (en) | Cross-domain emotion classification system based on attention mechanism fusion | |
CN114881042B (en) | Chinese emotion analysis method based on graph-convolution network fusion of syntactic dependency and part of speech | |
CN112232053A (en) | Text similarity calculation system, method and storage medium based on multi-keyword pair matching | |
Chen et al. | Deep neural networks for multi-class sentiment classification | |
CN114462420A (en) | False news detection method based on feature fusion model | |
CN111914553A (en) | Financial information negative subject judgment method based on machine learning | |
Han et al. | An attention-based neural framework for uncertainty identification on social media texts | |
Verma et al. | Semantic similarity between short paragraphs using Deep Learning | |
Liu et al. | Research on advertising content recognition based on convolutional neural network and recurrent neural network | |
CN115169349A (en) | Chinese electronic resume named entity recognition method based on ALBERT | |
SiChen | A neural network based text classification with attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||