CN109508457B - Transfer learning method based on machine reading to sequence model - Google Patents
Transfer learning method based on machine reading to sequence model
Info
- Publication number
- CN109508457B CN109508457B CN201811284309.2A CN201811284309A CN109508457B CN 109508457 B CN109508457 B CN 109508457B CN 201811284309 A CN201811284309 A CN 201811284309A CN 109508457 B CN109508457 B CN 109508457B
- Authority
- CN
- China
- Prior art keywords
- model
- sequence
- vector
- layer
- machine reading
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a transfer learning method based on machine reading to a sequence model, which comprises the following steps: (1) pre-training a machine reading model, wherein the machine reading model comprises a coding layer and a model layer based on a recurrent neural network; (2) establishing a sequence model, wherein the sequence model comprises an encoder and a decoder based on a recurrent neural network; (3) extracting parameters of the coding layer and the model layer of the trained machine reading model, migrating them into the sequence model to be trained, and using them as part of the initialization parameters when the sequence model is trained; (4) training the sequence model until the model converges; (5) performing a text sequence prediction task by using the trained sequence model. With the method, the information contained in the text can be mined more deeply and the quality of the generated text sequence is improved.
Description
Technical Field
The invention belongs to the technical field of natural language processing, and particularly relates to a transfer learning method based on machine reading to a sequence model.
Background
Machine reading is one of the most popular and most challenging problems in natural language processing: it requires a model to understand natural language and to exploit existing knowledge. In the most common task setting, given an article and a question, the model must find the answer to the question within the article. With the recent release of several high-quality data sets, neural-network-based models have performed better and better on machine reading, even surpassing human performance on some data sets. An effective machine reading model can be widely applied in many fields based on semantic understanding, such as dialogue robots, question-answering systems, and search engines.
A sequence model with an attention mechanism mainly comprises an encoder and a decoder: the encoder encodes the input sequence, and the decoder then generates the output sequence token by token from the encoded input. Such structures have enjoyed tremendous success in natural language generation tasks such as machine translation, text summarization and dialogue systems. However, when training such an encoder-decoder, the output can only be optimized against a fixed reference sample, and it is difficult for the model to deeply understand the latent semantic information contained in the text.
Transfer learning refers to combining knowledge or features from different domains to establish a new model or probability distribution. In the field of natural language processing, transfer learning is widely applied. For example, "Natural Language Processing (Almost) from Scratch", published in 2011 in the international journal Journal of Machine Learning Research, discloses a unified neural network structure and applies unsupervised learning to multiple natural language processing tasks such as part-of-speech tagging and named entity recognition; "Learned in Translation: Contextualized Word Vectors", published in 2017 at the international Conference on Neural Information Processing Systems, discloses a method that migrates the encoder of a pre-trained machine translation model to text classification tasks and question-answering systems as new word vectors, enriching the original word vectors; "Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference", published in 2018 in the Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, discloses a training method based on discourse connectives.
However, existing transfer learning methods in natural language processing rarely migrate a multi-layer neural network to other tasks, and migrating only the coding layer loses a large amount of information from the original pre-trained model.
Disclosure of Invention
The invention provides a transfer learning method based on machine reading to a sequence model, which can mine the information contained in the text more deeply and improve the quality of the generated text sequence.
The technical scheme adopted by the invention is as follows:
a transfer learning method based on machine reading to sequence model comprises the following steps:
(1) pre-training a machine reading model, wherein the machine reading model comprises a coding layer and a model layer based on a recurrent neural network;
(2) establishing a sequence model, wherein the sequence model comprises an encoder, a decoder and an attention mechanism based on a recurrent neural network;
(3) extracting parameters of a coding layer and a model layer in a trained machine reading model, transferring the parameters into a sequence model to be trained, and using the parameters as part of initialization parameters of the training sequence model;
(4) training the sequence model until the model converges;
(5) performing a text sequence prediction task by using the trained sequence model.
In the method, a machine reading model comprising a coding layer and a model layer is pre-trained to serve as the migration source; the coding layer and the model layer are then embedded into the sequence model, fused with the sequence model's own encoding results, and finally used to output the probability distribution over output labels. This helps the sequence model understand the meaning of the text more deeply and generate more natural text.
In the step (1), the recurrent neural network in the coding layer is a bidirectional long short-term memory network, and the recurrent neural network in the model layer is a unidirectional long short-term memory network.
In the step (1), pre-training the machine reading model comprises the following specific steps (a code sketch follows the list):
(1-1) selecting training data, embedding the input text with pre-trained GloVe word vectors, and then feeding the embeddings into the bidirectional long short-term memory network of the coding layer;
(1-2) concatenating the hidden units side by side to form the sentence representation in each direction, and combining the representations of the two directions as the final representation of the input sequence;
(1-3) feeding the final representation of the article sequence and the final representation of the question sequence into the attention mechanism of the model, and outputting an attention matrix;
(1-4) inputting the attention matrix into the unidirectional long short-term memory network of the model layer, normalizing with the hidden units of the network, and outputting the predicted probability distribution;
(1-5) repeating the above steps until the machine reading model converges.
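For concreteness, the following minimal PyTorch sketch illustrates steps (1-1) to (1-5). The class name MachineReadingModel, the hidden size, the simplified single-direction attention, and the span-prediction output head are illustrative assumptions, not the exact pre-training architecture.

```python
import torch
import torch.nn as nn

class MachineReadingModel(nn.Module):
    """Illustrative sketch: GloVe embedding -> BiLSTM coding layer ->
    attention -> unidirectional LSTM model layer -> span prediction."""
    def __init__(self, glove_weights, hidden_dim=128):
        super().__init__()
        # (1-1) word embedding initialized from pre-trained GloVe vectors
        self.embedding = nn.Embedding.from_pretrained(glove_weights, freeze=True)
        emb_dim = glove_weights.size(1)
        # coding layer: bidirectional LSTM shared by article and question
        self.coding_layer = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                                    bidirectional=True)
        # model layer: unidirectional LSTM over the attention matrix
        self.model_layer = nn.LSTM(4 * hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 2)   # start/end logits of the answer span

    def forward(self, article_ids, question_ids):
        # (1-2) hidden units of both directions form the sequence representation
        a, _ = self.coding_layer(self.embedding(article_ids))   # (B, La, 2H)
        q, _ = self.coding_layer(self.embedding(question_ids))  # (B, Lq, 2H)
        # (1-3) simplified attention: each article word attends over the question
        scores = torch.bmm(a, q.transpose(1, 2))                # (B, La, Lq)
        attended = torch.bmm(torch.softmax(scores, dim=-1), q)  # (B, La, 2H)
        attention_matrix = torch.cat([a, attended], dim=-1)     # (B, La, 4H)
        # (1-4) model layer integrates the attention matrix
        m, _ = self.model_layer(attention_matrix)               # (B, La, H)
        # predicted probability distribution over article positions (start/end)
        return torch.log_softmax(self.out(m), dim=1)
```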
In the step (2), the sequence model mainly comprises an encoder and a decoder. To stay consistent with the parameters of the migration source, long short-term memory networks are also adopted as the main parameterized components of the sequence model, and the recurrent neural network in the encoder is a bidirectional long short-term memory network.
In the step (3), the extracted parameters of the coding layer and the model layer are the recurrent neural networks in those layers. The network of the coding layer and the network of the model layer are extracted separately and migrated into the sequence model to be trained, where they serve as part of the initialization parameters when training the sequence model.
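A sketch of this migration is shown below, assuming the hypothetical MachineReadingModel above and the Seq2SeqModel sketched after the step (4) list; the checkpoint path and variable names are illustrative assumptions.

```python
import torch

# Hypothetical migration of the pretrained recurrent weights (names assumed).
pretrained = MachineReadingModel(glove_weights)           # glove_weights: GloVe tensor
pretrained.load_state_dict(torch.load("reading_model.pt"))  # assumed checkpoint path

seq_model = Seq2SeqModel(glove_weights)   # sketched after the step (4) list below

# migrate the coding-layer BiLSTM and the model-layer LSTM; their state dicts
# become part of the sequence model's initialization
seq_model.migrated_coding_layer.load_state_dict(pretrained.coding_layer.state_dict())
seq_model.migrated_model_layer.load_state_dict(pretrained.model_layer.state_dict())

# all remaining sequence-model parameters keep their random initialization
# and are trained from scratch in step (4)
```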
The specific steps of step (4) are as follows (a code sketch follows the list):
(4-1) simultaneously feeding the input word sequence into the encoder of the sequence model and the migrated coding layer of the machine reading model to obtain merged encoded vectors;
(4-2) feeding the merged vectors into a unidirectional long short-term memory network for integration to obtain an integrated encoding of the input text sequence;
(4-3) taking the integrated encoding vector as the initialization vector of the decoder, and performing attention interaction between the hidden units of the decoder and the units of the integrated vectors to obtain an attention vector a_t, where t indexes the t-th decoded word;
(4-4) inputting the attention vector a_t into the migrated model layer of the machine reading model, then integrating the output vector r_t of the model layer with the attention vector a_t through a linear function and feeding the result into a softmax function to obtain the probability distribution of the predicted sequence; the softmax function is:

P(y_t | y_<t, x) = softmax(W_p·a_t + W_q·r_t + b_p)

where W_p, W_q and b_p are parameters to be trained, and y_t is the t-th word output by the decoder.
(4-5) repeating the above steps until the model converges.
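The following minimal PyTorch sketch illustrates steps (4-1) to (4-4), wired so that the migrated modules match the earlier MachineReadingModel sketch; the class name Seq2SeqModel, the dot-product attention, and all dimensions are illustrative assumptions rather than the exact architecture.

```python
class Seq2SeqModel(nn.Module):
    """Illustrative sketch of the sequence model trained in step (4)."""
    def __init__(self, glove_weights, hidden_dim=128, vocab_size=50000):
        super().__init__()
        self.embedding = nn.Embedding.from_pretrained(glove_weights, freeze=False)
        emb_dim = glove_weights.size(1)
        # own encoder plus the migrated coding layer (both BiLSTMs over the input)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        self.migrated_coding_layer = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                                             bidirectional=True)
        # (4-2) unidirectional LSTM that integrates the merged vectors
        self.integration = nn.LSTM(4 * hidden_dim, 2 * hidden_dim, batch_first=True)
        # decoder over target-word embeddings
        self.decoder = nn.LSTM(emb_dim, 2 * hidden_dim, batch_first=True)
        # (4-4) migrated model layer consumes the attention vector a_t
        self.migrated_model_layer = nn.LSTM(4 * hidden_dim, hidden_dim,
                                            batch_first=True)
        self.W_p = nn.Linear(4 * hidden_dim, vocab_size, bias=False)
        self.W_q = nn.Linear(hidden_dim, vocab_size, bias=False)
        self.b_p = nn.Parameter(torch.zeros(vocab_size))

    def forward(self, src_ids, tgt_ids):
        x = self.embedding(src_ids)
        # (4-1) encode with both encoders and merge
        e1, _ = self.encoder(x)                        # (B, Ls, 2H)
        e2, _ = self.migrated_coding_layer(x)          # (B, Ls, 2H)
        merged = torch.cat([e1, e2], dim=-1)           # (B, Ls, 4H)
        # (4-2) integrate the merged encoding
        mem, (h, c) = self.integration(merged)         # mem: (B, Ls, 2H)
        # (4-3) decoder initialized from the integrated encoding, plus attention
        d, _ = self.decoder(self.embedding(tgt_ids), (h, c))   # (B, Lt, 2H)
        scores = torch.bmm(d, mem.transpose(1, 2))     # (B, Lt, Ls)
        ctx = torch.bmm(torch.softmax(scores, dim=-1), mem)    # (B, Lt, 2H)
        a_t = torch.cat([d, ctx], dim=-1)              # attention vectors (B, Lt, 4H)
        # (4-4) pass a_t through the migrated model layer to get r_t
        r_t, _ = self.migrated_model_layer(a_t)        # (B, Lt, H)
        logits = self.W_p(a_t) + self.W_q(r_t) + self.b_p
        return torch.log_softmax(logits, dim=-1)       # log P(y_t | y_<t, x)
```

One plausible training loop, under the same assumptions, would feed the target sequence shifted right to the decoder (teacher forcing) and minimize the negative log-likelihood of the true next words, repeating until convergence as in step (4-5).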
The invention has the following beneficial effects:
1. The invention uses transfer learning to transfer knowledge learned in other question-answering systems to the text generation task, improving the accuracy of the encoder-decoder structure while keeping the whole model simple and intuitive.
2. The method fully utilizes the high performance of the existing machine reading model, the transferred parameters comprise multilayer neural networks, the trained machine reading model parameters are randomly initialized instead of the sequence model parameters, and the sequence model can be helped to more deeply mine the information contained in the text, so that the content is deeper, and the quality of the generated text sequence is improved.
Drawings
FIG. 1 is a flow chart of a transfer learning method based on machine reading to sequence model according to the present invention;
FIG. 2 is a schematic diagram of the overall structure of the machine reading model and the sequence model of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantageous effects of the present invention more clearly apparent, the technical contents and specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
As shown in fig. 1, a transfer learning method based on machine reading to sequence model includes the following steps:
s01, pre-training a machine reading model.
We use the Stanford question-answering dataset SQuAD, a large-scale, high-quality corpus, as the training set. The task is, given an article and a question, to predict the answer, which is a contiguous span of the article.
Referring to fig. 2, pre-trained GloVe word vectors are used to embed the input text, and the embeddings are then fed into the bidirectional long short-term memory network (BiLSTM) of the Encoding Layer. We join the hidden units side by side to form the sentence representation in each direction, and merge the representations of the two directions as the final representation of the input sequence. Subsequently, we feed the representation of the article sequence and the representation of the question sequence into the Attention Mechanism. The attention mechanism is a function composed of a series of normalized linear operations and logical operations; for details see pages 3 to 4 of "Bi-Directional Attention Flow for Machine Comprehension", published in 2017 at the International Conference on Learning Representations (ICLR). The output of the attention mechanism is a matrix of attention vectors, one per article word. Finally, we input the attention matrix into the unidirectional long short-term memory network (LSTM) of the Modeling Layer, normalize with the hidden units of the network, and output the predicted probability distribution through a softmax function.
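A rough sketch of the cited bidirectional attention flow is given below; the trilinear similarity and the two attention directions follow the cited paper in simplified form, and the class name and dimensions are illustrative assumptions, not the patent's exact attention function.

```python
import torch
import torch.nn as nn

class BiDAFAttention(nn.Module):
    """Simplified sketch of the bidirectional attention flow cited above
    (Seo et al., ICLR 2017); an assumption, not the patent's exact attention."""
    def __init__(self, d):                        # d: size of coding-layer outputs
        super().__init__()
        self.w = nn.Linear(3 * d, 1, bias=False)  # trilinear similarity weights

    def forward(self, a, q):                      # a: (B, La, d), q: (B, Lq, d)
        La, Lq = a.size(1), q.size(1)
        a_exp = a.unsqueeze(2).expand(-1, -1, Lq, -1)      # (B, La, Lq, d)
        q_exp = q.unsqueeze(1).expand(-1, La, -1, -1)      # (B, La, Lq, d)
        sim = self.w(torch.cat([a_exp, q_exp, a_exp * q_exp], dim=-1)).squeeze(-1)
        # context-to-query: each article word attends over the question words
        c2q = torch.bmm(torch.softmax(sim, dim=2), q)      # (B, La, d)
        # query-to-context: weight article words by their best question match
        b = torch.softmax(sim.max(dim=2).values, dim=1)    # (B, La)
        q2c = torch.bmm(b.unsqueeze(1), a).expand(-1, La, -1)  # (B, La, d)
        # attention matrix handed to the model layer (width 4d here; the model
        # layer's input size must match whatever width is chosen)
        return torch.cat([a, c2q, a * c2q, a * q2c], dim=-1)   # (B, La, 4d)
```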
S02, extracting the coding-layer and model-layer parameters of the machine reading model. Step S01 uses long short-term memory networks, a kind of recurrent neural network, and these are the parameters we extract. The network of the coding layer and the network of the model layer are extracted separately and prepared as initialization parameters for the next task.
S03, embedding the parameters extracted in step S02 into the sequence model as initialization of part of its parameters.
Structure of the sequence model: referring to fig. 2, the sequence model mainly consists of an Encoder and a Decoder. To stay consistent with the parameters of the migration source, we also use long short-term memory networks as the main parameterized components of the sequence model. We first feed the input word sequence into both the encoder of the sequence model and the migrated coding layer of the machine reading model, obtaining merged encoded vectors; the merged vectors are then fed into a unidirectional long short-term memory network for integration, yielding an encoding of the input text sequence that integrates the two encoders from different sources. The integrated encoding vector is used as the initialization vector of the decoder, and attention interaction between the hidden units of the decoder and the units of the integrated vectors produces an attention vector a_t, where t indexes the t-th decoded word. In a general sequence model, the attention vector is finally fed into a softmax function to normalize and generate the predicted probability distribution:

P(y_t | y_<t, x) = softmax(W_p·a_t + b_p)

where W_p and b_p are parameters to be trained and y_t is the t-th word output by the decoder. In the method of the present invention, however, we first input the attention vector into the migrated model layer of the machine reading model, then integrate the output vector r_t of the model layer with the original attention vector through a linear function and feed the result into the softmax function to obtain the probability distribution of the predicted sequence:

P(y_t | y_<t, x) = softmax(W_p·a_t + W_q·r_t + b_p)

where W_q is a parameter to be trained.
S04, training the sequence model, with the migrated trained parameters as initialization and the other parameters randomly initialized, until convergence.
S05, performing text sequence prediction tasks such as machine translation and text summarization by using the trained model.
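As an illustration of how the converged model might be applied to such a prediction task, the sketch below performs greedy decoding with the hypothetical Seq2SeqModel above; beam search or other decoding strategies could equally be substituted.

```python
import torch

@torch.no_grad()
def greedy_decode(model, src_ids, bos_id, eos_id, max_len=100):
    """Greedy decoding with the trained sequence model (illustrative sketch).
    src_ids: (1, Ls) LongTensor of source-word indices."""
    model.eval()
    tgt = torch.tensor([[bos_id]], dtype=torch.long, device=src_ids.device)
    for _ in range(max_len):
        log_probs = model(src_ids, tgt)                  # (1, Lt, V)
        next_id = log_probs[:, -1, :].argmax(dim=-1, keepdim=True)
        tgt = torch.cat([tgt, next_id], dim=1)           # append the chosen word
        if next_id.item() == eos_id:
            break
    return tgt.squeeze(0).tolist()                       # generated token ids
```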
To demonstrate the effectiveness of the method, comparison experiments were carried out on two tasks: neural machine translation and abstractive text summarization. For machine translation, the WMT 2014 and WMT 2015 English-to-German corpora were used; for text summarization, the CNN/Daily Mail and Gigaword data sets were used. After preprocessing, CNN/Daily Mail contains 287k training pairs and Gigaword contains 3.8M training pairs.
The results of the comparative experiments on the machine translation tasks are shown in Table 1. In Table 1, the first column is the baseline model, the middle columns add the components of the method one by one, and the last column is the full method. It can be seen that, on the machine translation tasks, the method of the invention (MacNet) clearly improves over the baseline model, and the component-by-component comparisons confirm the effectiveness of each part.
TABLE 1
The results of the comparative experiments on the text summarization task are shown in Table 2. This experiment compares against the best currently published methods on the text summarization test sets. Overall, the method of the invention (Pointer-Generator + MacNet) achieves higher accuracy than the other methods and obtains the current best results on most metrics on both data sets.
TABLE 2
In addition, Table 3 shows several examples in detail, illustrating the qualitative effect on generated text summaries before and after incorporating the method of the present invention.
TABLE 3
In the above table, PG is the abbreviation of the baseline Pointer-Generator model, Reference is the reference answer given in the data set, and PG + MacNet is the model with the method of the present invention added. It can be seen that when an uncommon word appears in the source text, the original baseline model struggles to produce a well-formed subject-predicate-object structure, and when the source text is long and structurally complex, the baseline model even produces ungrammatical sentences. After the method of the invention is added, the generated summary sentences are fluent and natural, and the main idea is expressed essentially in place.
The embodiments described in this specification are only for illustrative purposes and are not intended to limit the invention, the scope of the invention should not be limited to the specific embodiments described in the embodiments, and any modifications, substitutions, changes, etc. within the spirit and principle of the invention are included in the scope of the invention.
Claims (5)
1. A transfer learning method based on machine reading to sequence model, characterized by comprising the following steps:
(1) pre-training a machine reading model, wherein the machine reading model comprises a coding layer and a model layer based on a recurrent neural network;
(2) establishing a sequence model, wherein the sequence model comprises an encoder, a decoder and an attention mechanism based on a recurrent neural network;
(3) extracting parameters of a coding layer and a model layer in a trained machine reading model, transferring the parameters into a sequence model to be trained, and using the parameters as part of initialization parameters when the sequence model is trained;
(4) training a sequence model, specifically comprising the following steps:
(4-1) simultaneously feeding the input word sequence into the encoder of the sequence model and the migrated coding layer of the machine reading model to obtain merged encoded vectors;
(4-2) feeding the merged vectors into a unidirectional long short-term memory network for integration to obtain an integrated encoding of the input text sequence;
(4-3) taking the integrated encoding vector as the initialization vector of the decoder, and performing attention interaction between the hidden units of the decoder and the units of the integrated vectors to obtain an attention vector a_t, where t indexes the t-th decoded word;
(4-4) inputting the attention vector a_t into the migrated model layer of the machine reading model, then integrating the output vector r_t of the model layer with the attention vector a_t through a linear function and feeding the result into a softmax function to obtain the probability distribution of the predicted sequence;
(4-5) repeating the above steps until the model converges;
(5) performing a text sequence prediction task by using the trained sequence model.
2. The method according to claim 1, wherein in step (1), the recurrent neural network in the coding layer is a bidirectional long short-term memory network, and the recurrent neural network in the model layer is a unidirectional long short-term memory network.
3. The transfer learning method based on machine reading to sequence model according to claim 2, wherein in step (1), the pre-training comprises the following specific steps:
(1-1) selecting training data, embedding the input text with pre-trained GloVe word vectors, and then feeding the embeddings into the bidirectional long short-term memory network of the coding layer;
(1-2) concatenating the hidden units side by side to form the sentence representation in each direction, and combining the representations of the two directions as the final representation of the input sequence;
(1-3) feeding the final representation of the article sequence and the final representation of the question sequence into the attention mechanism of the model, and outputting an attention matrix;
(1-4) inputting the attention matrix into the unidirectional long short-term memory network of the model layer, normalizing with the hidden units of the network, and outputting the predicted probability distribution;
(1-5) repeating the above steps until the machine reading model converges.
4. The method according to claim 1, wherein in step (2), the recurrent neural network in the encoder is a bidirectional long short-term memory network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811284309.2A CN109508457B (en) | 2018-10-31 | 2018-10-31 | Transfer learning method based on machine reading to sequence model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811284309.2A CN109508457B (en) | 2018-10-31 | 2018-10-31 | Transfer learning method based on machine reading to sequence model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109508457A CN109508457A (en) | 2019-03-22 |
CN109508457B true CN109508457B (en) | 2020-05-29 |
Family
ID=65747209
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811284309.2A Active CN109508457B (en) | 2018-10-31 | 2018-10-31 | Transfer learning method based on machine reading to sequence model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109508457B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200364303A1 (en) * | 2019-05-15 | 2020-11-19 | Nvidia Corporation | Grammar transfer using one or more neural networks |
CN110188182B (en) * | 2019-05-31 | 2023-10-27 | 中国科学院深圳先进技术研究院 | Model training method, dialogue generating method, device, equipment and medium |
CN110188331B (en) * | 2019-06-03 | 2023-05-26 | 腾讯科技(深圳)有限公司 | Model training method, dialogue system evaluation method, device, equipment and storage medium |
CN110415702A (en) * | 2019-07-04 | 2019-11-05 | 北京搜狗科技发展有限公司 | Training method and device, conversion method and device |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108228571A (en) * | 2018-02-01 | 2018-06-29 | 北京百度网讯科技有限公司 | Generation method, device, storage medium and the terminal device of distich |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102521656B (en) * | 2011-12-29 | 2014-02-26 | 北京工商大学 | Integrated transfer learning method for classification of unbalance samples |
US20160350653A1 (en) * | 2015-06-01 | 2016-12-01 | Salesforce.Com, Inc. | Dynamic Memory Network |
US10776707B2 (en) * | 2016-03-08 | 2020-09-15 | Shutterstock, Inc. | Language translation based on search results and user interaction data |
CN105787560B (en) * | 2016-03-18 | 2018-04-03 | 北京光年无限科技有限公司 | Dialogue data interaction processing method and device based on Recognition with Recurrent Neural Network |
US20180260474A1 (en) * | 2017-03-13 | 2018-09-13 | Arizona Board Of Regents On Behalf Of The University Of Arizona | Methods for extracting and assessing information from literature documents |
CN107341146B (en) * | 2017-06-23 | 2020-08-04 | 上海交大知识产权管理有限公司 | Migratable spoken language semantic analysis system based on semantic groove internal structure and implementation method thereof |
CN107590138B (en) * | 2017-08-18 | 2020-01-31 | 浙江大学 | neural machine translation method based on part-of-speech attention mechanism |
-
2018
- 2018-10-31 CN CN201811284309.2A patent/CN109508457B/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108228571A (en) * | 2018-02-01 | 2018-06-29 | 北京百度网讯科技有限公司 | Generation method, device, storage medium and the terminal device of distich |
Also Published As
Publication number | Publication date |
---|---|
CN109508457A (en) | 2019-03-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109508457B (en) | Transfer learning method based on machine reading to sequence model | |
CN107357789B (en) | Neural machine translation method fusing multi-language coding information | |
CN108717574B (en) | Natural language reasoning method based on word connection marking and reinforcement learning | |
CN111783462A (en) | Chinese named entity recognition model and method based on dual neural network fusion | |
WO2021022816A1 (en) | Intent identification method based on deep learning network | |
CN109992669B (en) | Keyword question-answering method based on language model and reinforcement learning | |
CN111078866B (en) | Chinese text abstract generation method based on sequence-to-sequence model | |
CN111723547A (en) | Text automatic summarization method based on pre-training language model | |
CN111581962B (en) | Text representation method based on subject word vector and hybrid neural network | |
CN108549644A (en) | Omission pronominal translation method towards neural machine translation | |
CN110765264A (en) | Text abstract generation method for enhancing semantic relevance | |
CN110874411A (en) | Cross-domain emotion classification system based on attention mechanism fusion | |
CN116306652A (en) | Chinese naming entity recognition model based on attention mechanism and BiLSTM | |
CN114881042B (en) | Chinese emotion analysis method based on graph-convolution network fusion of syntactic dependency and part of speech | |
CN113407663B (en) | Image-text content quality identification method and device based on artificial intelligence | |
Li et al. | Cm-gen: A neural framework for chinese metaphor generation with explicit context modelling | |
KR20210058059A (en) | Unsupervised text summarization method based on sentence embedding and unsupervised text summarization device using the same | |
CN113887251A (en) | Mongolian Chinese machine translation method combining Meta-KD framework and fine-grained compression | |
CN113743095A (en) | Chinese problem generation unified pre-training method based on word lattice and relative position embedding | |
CN117932066A (en) | Pre-training-based 'extraction-generation' answer generation model and method | |
CN114997143B (en) | Text generation model training method and system, text generation method and storage medium | |
CN114519353B (en) | Model training method, emotion message generation method and device, equipment and medium | |
CN113377908B (en) | Method for extracting aspect-level emotion triple based on learnable multi-word pair scorer | |
Cho | Introduction to neural machine translation with GPUs (part 3) | |
Wang | Text emotion detection based on Bi-LSTM network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||