CN111428519B - Entropy-based neural machine translation dynamic decoding method and system - Google Patents
- Publication number
- CN111428519B (application CN202010151246.4A)
- Authority
- CN
- China
- Prior art keywords
- entropy
- vector
- word
- time step
- target language
- Prior art date: 2020-03-06
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention provides an entropy-based neural machine translation dynamic decoding method and system. By analyzing the relation between the entropy of a sentence and its BLEU score, it is found that the average entropy of the words in sentences with high BLEU scores is smaller than that in sentences with low BLEU scores, and that sentences with low entropy generally obtain higher BLEU scores than sentences with high entropy. Computing the Pearson coefficient between sentence entropy and BLEU score confirms the correlation between the two. The invention therefore proposes that, at each time step of the decoding stage during training, the model not only samples the real word or the predicted word with a certain probability to obtain context information, but also computes an entropy value from the prediction result of the previous time step and dynamically adjusts the weight of the context information according to that entropy. This alleviates the error accumulation caused by the difference in context information between training and inference during decoding of a neural machine translation model.
Description
Technical Field
The invention relates to the technical fields of natural language processing and neural machine translation, and in particular to an entropy-based neural machine translation dynamic decoding method and system.
Background
Machine translation is an important task in natural language processing, and in recent years, with the rise of deep neural networks, machine translation methods based on neural networks have made great progress and have gradually become mainstream machine translation methods. The neural machine translation model mainly comprises three parts: an encoder network, a decoder network, and an attention network.
The encoder network is responsible for encoding the source-language sentence into a list of hidden vectors, with one hidden vector representation per word. The encoder network is typically a multi-layer bidirectional RNN: the forward RNN reads the source sentence in order (from $x_1$ to $x_{|x|}$) and computes the forward hidden-state sequence $(\overrightarrow{h}_1, \ldots, \overrightarrow{h}_{|x|})$, while the backward RNN reads the sentence in reverse order (from $x_{|x|}$ to $x_1$) and computes the backward hidden-state sequence $(\overleftarrow{h}_1, \ldots, \overleftarrow{h}_{|x|})$. The hidden vector corresponding to word $x_i$ is the concatenation $h_i = [\overrightarrow{h}_i; \overleftarrow{h}_i]$, so $h_i$ contains the semantic information of both the preceding and the following words.
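As a concrete illustration, such a bidirectional encoder can be sketched as follows (the dimensions and the use of PyTorch's GRU are illustrative assumptions, not specifics from the patent):

```python
import torch
import torch.nn as nn

class BiRNNEncoder(nn.Module):
    """Encode a source sentence into one hidden vector per word, where each
    h_i concatenates the forward and backward RNN states at position i."""
    def __init__(self, vocab_size, emb_dim=256, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)

    def forward(self, src_ids):            # src_ids: (batch, |x|) token indices
        emb = self.embed(src_ids)          # (batch, |x|, emb_dim)
        h, _ = self.rnn(emb)               # (batch, |x|, 2*hid_dim)
        return h                           # h[:, i] = [h_fwd_i ; h_bwd_i]
```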
The attention network generates a list of hidden vectors (h) from the encoder network1,…,h|x|) And the current hidden state vector sj-1Computing a context vector cjAnd passed to the decoder network. First, calculate the hidden vector list (h)1,…,h|x|) With the current hidden layer state vector sj-1The degree of correlation between the two is obtained to obtain a weight list (alpha)1j,…,α|x|j) Then, the weight list is used to perform weighted summation on the hidden vector list to calculate a context vector cjFor the next hidden state vector sjAnd (4) calculating.
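A minimal sketch of this attention step, assuming the Bahdanau-style additive scoring that the parameter names $V_a$, $W_a$, and $U_a$ later in the text suggest:

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Score each encoder vector h_i against the decoder state s_{j-1},
    normalize the scores into weights alpha_{ij}, and return c_j."""
    def __init__(self, hid_dim):
        super().__init__()
        self.W_a = nn.Linear(hid_dim, hid_dim, bias=False)
        self.U_a = nn.Linear(hid_dim, hid_dim, bias=False)
        self.V_a = nn.Linear(hid_dim, 1, bias=False)

    def forward(self, h, s_prev):          # h: (B, |x|, d), s_prev: (B, d)
        scores = self.V_a(torch.tanh(self.W_a(s_prev).unsqueeze(1) + self.U_a(h)))
        alpha = torch.softmax(scores, dim=1)   # (B, |x|, 1), sums to 1 over |x|
        c_j = (alpha * h).sum(dim=1)           # weighted sum of the h_i
        return c_j, alpha.squeeze(-1)
```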
The decoder network is typically a multi-layer RNN. At each time step it uses the current word vector $y_{j-1}$, the hidden-state vector $s_{j-1}$, and the context vector $c_j$ computed by the attention network to calculate the hidden-state vector $s_j$ of the next time step and to decode a target-language word $y_j$, until a special end-of-sentence symbol (EOS) is generated.
The architecture of the existing neural machine translation model is shown in FIG. 1. Although existing neural machine translation models have achieved good results, some shortcomings remain. In the prior art, the model decodes the target words in turn according to context information. In the training phase, the model predicts using the real word as context information, while in the inference phase it must generate the whole sequence from scratch and can only use the prediction result of the previous time step as context information. This difference in context information between training and inference results in an accumulation of errors, forcing the model to predict in situations it never saw during the training phase.
In existing neural machine translation models, each time step of the decoding process calculates the hidden-state vector $s_j$ of the next time step from the current word vector $y_{j-1}$, the hidden-state vector $s_{j-1}$, and the context vector $c_j$ computed by the attention network, i.e. $s_j = f(y_{j-1}, s_{j-1}, c_j)$. In the training phase, $y_{j-1}$ is the real target-language word $y^*_{j-1}$ from the training corpus, while in the inference phase, $y_{j-1}$ is the target-language word $\hat{y}_{j-1}$ predicted at the previous time step. To reduce this difference between training and inference, the model can sample the context information from the real sequence and the predicted sequence with a certain probability during the training phase, rather than always selecting the target-language word from the real sequence. Although this method reduces the gap between the training and inference phases to a certain extent and improves translation quality, when the sampling selects a word from the predicted sequence, the uncertainty of the prediction introduces prediction errors into the training process and reduces the robustness of the model.
Disclosure of Invention
The invention aims to solve the problem of error accumulation of a neural machine translation model in the decoding process caused by the difference in context information between training and inference. The prior art provides a method for keeping training and prediction consistent in machine translation: during training, a correct word or a predicted word is sampled with a certain probability at each decoding position, so that the flows of training and prediction stay consistent. However, due to the uncertainty of prediction itself, this method introduces prediction errors into the training process whenever a word from the predicted sequence is selected, reducing the robustness of the model. By analyzing the correlation between the entropy of a sentence and its bilingual evaluation understudy (BLEU) score, the invention builds on the training-prediction consistency method and dynamically adjusts the weight of the context information according to the entropy, reducing the influence of uncertainty on the translation result.
Specifically, the invention provides an entropy-based neural machine translation dynamic decoding method, comprising:
Step 1, transmitting the word vectors of the words in a source-language sentence of the training corpus into an encoder network to obtain the encoding vector list $(h_1, \ldots, h_{|x|})$ of the source-language sentence and the hidden-state vector $s_{j-1}$ of the $(j-1)$-th time step;
Step 2, the attention network obtains a context vector $c_j$ from the encoding vector list $(h_1, \ldots, h_{|x|})$ and the hidden-state vector $s_{j-1}$;
Step 3, obtaining the $(j-1)$-th real target-language word $y^*_{j-1}$ in the training corpus and the target-language word $\hat{y}_{j-1}$ predicted at the $(j-1)$-th time step, selecting the real target-language word $y^*_{j-1}$ with probability $p$ and the predicted target-language word $\hat{y}_{j-1}$ with probability $1-p$, to obtain a selection result $y_{j-1}$;
Step 4, according to the selection result $y_{j-1}$, obtaining the entropy $e_{j-1}$ of the $(j-1)$-th time step by the formula $e_{j-1} = -\sum_{i=1}^{N} p_{i,j-1}\log p_{i,j-1}$, where $N$ is the size of the target-language lexicon and $p_{i,j-1}$ is the predicted probability of the $i$-th target-language word;
Step 5, inputting the selection result $y_{j-1}$, the hidden-state vector $s_{j-1}$, the context vector $c_j$, and the entropy $e_{j-1}$ into a decoder network to obtain the hidden-state vector $s_j$ of the current $j$-th time step;
Step 6, obtaining the target-language word $\hat{y}_j$ of the $j$-th time step according to $y_{j-1}$, the hidden-state vector $s_j$, and the context vector $c_j$;
Step 7, transmitting the encoding vector list $(h_1, \ldots, h_{|x|})$, the hidden-state vector $s_j$ of the $j$-th time step, the $j$-th real target-language word $y^*_j$ in the training corpus, and the target-language word $\hat{y}_j$ predicted at the $j$-th time step to the $(j+1)$-th time step, and continuing the decoding process until the special end-of-sentence symbol EOS is generated.
The neural machine translation dynamic decoding method based on entropy, wherein step 2 comprises:
obtaining the context vector $c_j = \sum_{i=1}^{|x|} \alpha_{ij} h_i$ through the attention network, where the weight $\alpha_{ij}$ is the softmax-normalized score $V_a^{\top}\tanh(W_a s_{j-1} + U_a h_i)$ and reflects the importance of the hidden state $h_i$ relative to the hidden state $s_{j-1}$ in determining the next hidden state $s_j$ and predicting $y_j$;
where $x$ is the source-language sentence, and $V_a$, $W_a$, and $U_a$ are all parameters to be learned in the neural network.
The neural machine translation dynamic decoding method based on entropy, wherein step 3 comprises:
computing the sampling probability $p$ as a decay function of the training progress, where $\mu$ is a hyperparameter and $e$ is the number of training rounds.
The neural machine translation dynamic decoding method based on entropy, wherein step 6 comprises:
$o_j = W_o t_j$;
$P_j = \mathrm{softmax}(o_j)$;
where $e_{j-1}$ is the entropy of the probability distribution of the word predicted at the $(j-1)$-th time step, and $W_o$ is a parameter to be learned in the neural network.
The neural machine translation dynamic decoding method based on entropy, wherein step 7 comprises:
the encoder network computes the list of hidden vectors $(h_1, \ldots, h_{|x|})$ corresponding to the source-language sentence, in which each word $x_i$ has a word-vector representation and $h_i$ is the hidden-vector representation corresponding to word $x_i$.
the invention also provides a neural machine translation dynamic decoding system based on entropy, which comprises
Module 2, in which the attention network obtains a context vector $c_j$ from the encoding vector list $(h_1, \ldots, h_{|x|})$ and the hidden-state vector $s_{j-1}$;
Module 3, which obtains the $(j-1)$-th real target-language word $y^*_{j-1}$ in the training corpus and the target-language word $\hat{y}_{j-1}$ predicted at the $(j-1)$-th time step, selects the real target-language word $y^*_{j-1}$ with probability $p$ and the predicted target-language word $\hat{y}_{j-1}$ with probability $1-p$, and obtains a selection result $y_{j-1}$;
Module 4, which, according to the selection result $y_{j-1}$, obtains the entropy $e_{j-1}$ of the $(j-1)$-th time step by the formula $e_{j-1} = -\sum_{i=1}^{N} p_{i,j-1}\log p_{i,j-1}$, where $N$ is the size of the target-language lexicon and $p_{i,j-1}$ is the predicted probability of the $i$-th target-language word;
Module 5, which inputs the selection result $y_{j-1}$, the hidden-state vector $s_{j-1}$, the context vector $c_j$, and the entropy $e_{j-1}$ into a decoder network to obtain the hidden-state vector $s_j$ of the current $j$-th time step;
Module 6, which obtains the target-language word $\hat{y}_j$ of the $j$-th time step according to $y_{j-1}$, the hidden-state vector $s_j$, and the context vector $c_j$;
Module 7, which transmits the encoding vector list $(h_1, \ldots, h_{|x|})$, the hidden-state vector $s_j$ of the $j$-th time step, the $j$-th real target-language word $y^*_j$ in the training corpus, and the target-language word $\hat{y}_j$ predicted at the $j$-th time step to the $(j+1)$-th time step, and continues the decoding process until the special end-of-sentence symbol EOS is generated.
The neural machine translation dynamic decoding system based on entropy, wherein module 2 comprises:
obtaining the context vector $c_j = \sum_{i=1}^{|x|} \alpha_{ij} h_i$ through the attention network, where the weight $\alpha_{ij}$ is the softmax-normalized score $V_a^{\top}\tanh(W_a s_{j-1} + U_a h_i)$ and reflects the importance of the hidden state $h_i$ relative to the hidden state $s_{j-1}$ in determining the next hidden state $s_j$ and predicting $y_j$;
where $x$ is the source-language sentence, and $V_a$, $W_a$, and $U_a$ are all parameters to be learned in the neural network.
The neural machine translation dynamic decoding system based on entropy, wherein module 3 comprises:
computing the sampling probability $p$ as a decay function of the training progress, where $\mu$ is a hyperparameter and $e$ is the number of training rounds.
The neural machine translation dynamic decoding system based on entropy, wherein module 6 comprises:
$o_j = W_o t_j$;
$P_j = \mathrm{softmax}(o_j)$;
where $e_{j-1}$ is the entropy of the probability distribution of the word predicted at the $(j-1)$-th time step, and $W_o$ is a parameter to be learned in the neural network.
The neural machine translation dynamic decoding system based on entropy, wherein module 7 comprises:
the encoder network computes the list of hidden vectors $(h_1, \ldots, h_{|x|})$ corresponding to the source-language sentence, in which each word $x_i$ has a word-vector representation and $h_i$ is the hidden-vector representation corresponding to word $x_i$.
Drawings
FIG. 1 is a diagram of a prior art neural machine translation model architecture;
FIG. 2 is a schematic diagram of the sampling of $y_{j-1}$;
FIG. 3 is a GRU calculation schematic;
FIG. 4 is a flowchart of the entropy-based dynamic decoding method.
Detailed Description
In the course of research on neural machine translation, the inventor analyzed the relation between the entropy and the BLEU score of sentences and found that the average entropy of the words in sentences with high BLEU scores is smaller than that in sentences with low BLEU scores, and that sentences with low entropy obtain higher BLEU scores than sentences with high entropy. By computing the Pearson coefficient between the entropy and the BLEU score of sentences, the inventor found a correlation between the two. The invention therefore proposes that, at each time step of the decoding stage during training, the model not only samples the real word or the predicted word with a certain probability to obtain context information, but also computes an entropy value from the prediction result of the previous time step and then dynamically adjusts the weight of the context information according to that entropy.
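The correlation check described here can be reproduced with a few lines of NumPy; the sentence-level numbers below are dummy values for illustration, not data from the patent:

```python
import numpy as np

# Per-sentence average word entropy and BLEU score (dummy data).
entropy = np.array([0.8, 1.5, 0.4, 2.1, 1.0])
bleu    = np.array([42.0, 28.5, 47.3, 19.8, 35.1])

# A clearly negative Pearson coefficient would indicate the inverse
# entropy-BLEU relation the inventor reports.
r = np.corrcoef(entropy, bleu)[0, 1]
print(f"Pearson r = {r:.3f}")
```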
In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
RNN-based NMT Model
Encoder:
The encoder network computes the list of hidden vectors $(h_1, \ldots, h_{|x|})$ corresponding to the source-language sentence, in which each word $x_i$ has a word-vector representation and $h_i$ is the hidden-vector representation corresponding to word $x_i$.
Attention:
The attention network computes the context vector $c_j$ according to formulas (3)-(5); the weight $\alpha_{ij}$ reflects the importance of the hidden state $h_i$ relative to the hidden state $s_{j-1}$ in determining the next hidden state $s_j$ and predicting $y_j$,
where $V_a$, $W_a$, and $U_a$ are all parameters to be learned in the neural network.
Decoder:
The decoder network decodes the target language words in turn until a special end of sentence symbol (EOS) is generated.
$y_{j-1}$ (the $(j-1)$-th word in the target-language sentence) is selected with probability $p$, where $p$ is the sampling probability obtained by formula (6). Concretely, a 0-1 sampling vector is generated according to the probability $p$, and the selection is realized by multiplication: positions where the sampling vector is 1 select the real target-language word $y^*_{j-1}$, and positions where it is 0 select the predicted target-language word $\hat{y}_{j-1}$. For example, with probability $p = 0.3$, the sampling vector may take the value $[1,0,0,1,0,0,1,0,0,0]$. The real target-language word $y^*_{j-1}$ is thus selected with probability $p$ and the predicted target-language word $\hat{y}_{j-1}$ with probability $1-p$; $\mu$ is a hyperparameter and $e$ is the number of training rounds. The sampling is shown in FIG. 2.
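A minimal sketch of this sampling mechanism. Since the exact form of formula (6) is not reproduced above, the decay below uses the inverse-sigmoid schedule $p = \mu/(\mu + \exp(e/\mu))$ from the scheduled-sampling literature as an assumed stand-in, and `mu = 12.0` is a hypothetical value:

```python
import numpy as np

def sampling_probability(epoch: int, mu: float = 12.0) -> float:
    """Assumed form of formula (6): p decays toward 0 as training proceeds,
    so later epochs rely more on the model's own predictions."""
    return mu / (mu + np.exp(epoch / mu))

def select_context_words(real_words, predicted_words, p, rng=None):
    """0-1 sampling vector over positions: 1 -> real word y*_{j-1},
    0 -> predicted word, each position drawn with probability p."""
    rng = rng or np.random.default_rng()
    mask = rng.random(len(real_words)) < p
    return [r if keep else q for r, q, keep in zip(real_words, predicted_words, mask)]

p = sampling_probability(epoch=5)
print(select_context_words(["the", "cat", "sat"], ["a", "dog", "ran"], p))
```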
Computing the hidden-state vector $s_j$:
The calculation principle of $\mathrm{GRU}_2$ is shown in FIG. 3.
In formula (9), $[c_j; e_{j-1}]$ (vector concatenation) corresponds to $x_t$ in FIG. 3, the intermediate state from formula (8) corresponds to $h_{t-1}$ in FIG. 3, and $s_j$ corresponds to $h_t$ in FIG. 3; the weight of $c_j$ is adjusted according to the entropy $e_{j-1}$. The larger the entropy $e_{j-1}$, the greater the uncertainty and the worse the previous prediction, so the next time step makes less use of the predicted word and more use of the information in $c_j$.
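A minimal sketch of this entropy-conditioned state update, assuming a two-stage GRU decoder in the spirit of formulas (8)-(9), whose exact forms are not reproduced above; the first-stage state `t_prime` and the cell layout are therefore assumptions:

```python
import torch
import torch.nn as nn

class EntropyAwareDecoderStep(nn.Module):
    """One decoder step: GRU1 mixes the previous word and state; GRU2 consumes
    the context vector concatenated with the entropy scalar e_{j-1}, so an
    uncertain previous prediction shifts weight toward the source context c_j."""
    def __init__(self, emb_dim, hid_dim, ctx_dim):
        super().__init__()
        self.gru1 = nn.GRUCell(emb_dim, hid_dim)
        self.gru2 = nn.GRUCell(ctx_dim + 1, hid_dim)   # +1 input slot for e_{j-1}

    def forward(self, y_prev_emb, s_prev, c_j, e_prev):
        t_prime = self.gru1(y_prev_emb, s_prev)               # assumed formula (8)
        x_t = torch.cat([c_j, e_prev.unsqueeze(-1)], dim=-1)  # [c_j; e_{j-1}]
        s_j = self.gru2(x_t, t_prime)                         # assumed formula (9)
        return s_j
```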
The entropy is calculated by formula (10): $e_{j-1} = -\sum_{i=1}^{N} p_{i,j-1}\log p_{i,j-1}$, where $N$ is the size of the target-language dictionary. The prediction probability at the $(j-1)$-th time step is denoted $P_{j-1}$; it is an $N$-dimensional vector representing the predicted probabilities of all words in the target-language lexicon, in which the predicted probability of the $i$-th target-language word is denoted $p_{i,j-1}$.
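Formula (10) is the standard Shannon entropy of the predicted distribution, a direct sketch of which is below (the `eps` clamp is an implementation detail added here to avoid $\log 0$):

```python
import numpy as np

def prediction_entropy(p: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy e_{j-1} of the N-dimensional prediction vector P_{j-1}."""
    p = np.clip(p, eps, 1.0)
    return float(-(p * np.log(p)).sum())

# A peaked (confident) distribution has low entropy, a flat one high entropy:
print(prediction_entropy(np.array([0.97, 0.01, 0.01, 0.01])))  # ~0.17
print(prediction_entropy(np.full(4, 0.25)))                    # ~1.39 (= ln 4)
```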
The probability distribution $P_j$ over all words in the target-language dictionary is calculated as follows:
$o_j = W_o t_j$ (12)
$P_j = \mathrm{softmax}(o_j)$ (13)
where $e_{j-1}$ is an entropy value reflecting the uncertainty of the probability distribution of the word predicted at the $(j-1)$-th time step, and $W_o$ is a parameter to be learned in the network.
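A sketch of this readout in PyTorch; formulas (11) and (14), which produce $t_j$ and select the output word, are not reproduced above, so $t_j$ is treated as a given readout vector and the word choice is shown as a greedy argmax:

```python
import torch
import torch.nn as nn

N, hid = 30000, 512                  # illustrative vocabulary and hidden sizes
W_o = nn.Linear(hid, N, bias=False)  # the learned parameter W_o
t_j = torch.randn(1, hid)            # readout vector from formula (11), assumed given

o_j = W_o(t_j)                       # formula (12): vocabulary logits
P_j = torch.softmax(o_j, dim=-1)     # formula (13): distribution over N words
y_j = P_j.argmax(dim=-1)             # greedy choice of the j-th target word
```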
The use of the above-described entropy-based dynamic decoding technique is explained below:
First, the word vectors of the words in the source-language sentence are transmitted into the encoder network, and the encoding vector list $(h_1, \ldots, h_{|x|})$ corresponding to the sentence is obtained according to formulas (1)-(2). The target-language words are then decoded in sequence until a special end-of-sentence symbol (EOS) is generated. The specific decoding process at the $j$-th time step is as follows (a code sketch of one complete time step is given after these steps):
step S1, known quantity: code vector list (h)1,…,h|x|) Hidden state vector s at the j-1 st time stepj-1(the whole decoding process is carried out backward along with the time step, which is a concrete decoding process of the jth time step, so that the first j-1 time steps are all calculated), and the jth-1 real target language word in the training corpusTarget language word predicted at j-1 time step
Step S2, list of known encoding vectors (h)1,…,h|x|) And hidden state vector sj-1The attention network computes a context vector c according to equations (3) - (5)j。
Step S3, knowing the real target language wordAnd predicted target language wordsSampling selection y according to equations (6) - (7)j-1。
Step S4, known yj-1Calculating an entropy value e according to the formula (10)j-1。
Step S5, knowing the entropy value ej-1Target language word yj-1Hidden state vector sj-1And a context vector cjThe decoder network calculates the hidden state vector s for the jth time step according to equations (8) - (9)j。
Step S6, known yj-1Hidden layerState vector sjAnd a context vector cjPredicting the target language word at the jth time step according to equations (11) - (14)
Step S7, encoding vector list (h)1,…,h|x|) The hidden state vector s of the jth time stepjJ, the actual target language word in the training corpusAnd target language words predicted at jth time stepAnd (4) transmitting the j +1 th time step, and continuing the decoding process until a special end of sentence symbol (EOS) is generated.
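The promised sketch of one complete training-time decoding step, tying S1-S7 together; `model` here bundles the attention, decoder, and readout networks, and all of its attribute names are illustrative assumptions (`prediction_entropy` is the helper sketched earlier):

```python
import numpy as np

def decode_step(h_list, s_prev, y_true_prev, y_pred_prev, dist_prev, p, model,
                rng=None):
    """One decoding time step j during training (steps S1-S7)."""
    rng = rng or np.random.default_rng()
    c_j = model.attention(h_list, s_prev)              # S2: formulas (3)-(5)
    use_truth = rng.random() < p                       # S3: formulas (6)-(7)
    y_prev = y_true_prev if use_truth else y_pred_prev
    e_prev = prediction_entropy(dist_prev)             # S4: formula (10)
    s_j = model.decoder(y_prev, s_prev, c_j, e_prev)   # S5: formulas (8)-(9)
    y_pred_j, dist_j = model.readout(y_prev, s_j, c_j) # S6: formulas (11)-(14)
    return s_j, y_pred_j, dist_j                       # S7: carried to step j+1
```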
The following are system examples corresponding to the above method examples, and this embodiment can be implemented in cooperation with the above embodiments. The related technical details mentioned in the above embodiments are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related-art details mentioned in the present embodiment can also be applied to the above-described embodiments.
The invention also provides an entropy-based neural machine translation dynamic decoding system, comprising:
Module 1, which transmits the word vectors of the words in a source-language sentence of the training corpus into an encoder network to obtain the encoding vector list $(h_1, \ldots, h_{|x|})$ of the source-language sentence and the hidden-state vector $s_{j-1}$ of the $(j-1)$-th time step;
Module 2, in which the attention network obtains a context vector $c_j$ from the encoding vector list $(h_1, \ldots, h_{|x|})$ and the hidden-state vector $s_{j-1}$;
Module 3, which obtains the $(j-1)$-th real target-language word $y^*_{j-1}$ in the training corpus and the target-language word $\hat{y}_{j-1}$ predicted at the $(j-1)$-th time step, selects the real target-language word $y^*_{j-1}$ with probability $p$ and the predicted target-language word $\hat{y}_{j-1}$ with probability $1-p$, and obtains a selection result $y_{j-1}$;
Module 4, which, according to the selection result $y_{j-1}$, obtains the entropy $e_{j-1}$ of the $(j-1)$-th time step by the formula $e_{j-1} = -\sum_{i=1}^{N} p_{i,j-1}\log p_{i,j-1}$, where $N$ is the size of the target-language lexicon and $p_{i,j-1}$ is the predicted probability of the $i$-th target-language word;
Module 5, which inputs the selection result $y_{j-1}$, the hidden-state vector $s_{j-1}$, the context vector $c_j$, and the entropy $e_{j-1}$ into a decoder network to obtain the hidden-state vector $s_j$ of the current $j$-th time step;
Module 6, which obtains the target-language word $\hat{y}_j$ of the $j$-th time step according to $y_{j-1}$, the hidden-state vector $s_j$, and the context vector $c_j$;
Module 7, which transmits the encoding vector list $(h_1, \ldots, h_{|x|})$, the hidden-state vector $s_j$ of the $j$-th time step, the $j$-th real target-language word $y^*_j$ in the training corpus, and the target-language word $\hat{y}_j$ predicted at the $j$-th time step to the $(j+1)$-th time step, and continues the decoding process until the special end-of-sentence symbol EOS is generated.
The neural machine translation dynamic decoding system based on entropy, wherein module 2 comprises:
obtaining the context vector $c_j = \sum_{i=1}^{|x|} \alpha_{ij} h_i$ through the attention network, where the weight $\alpha_{ij}$ is the softmax-normalized score $V_a^{\top}\tanh(W_a s_{j-1} + U_a h_i)$ and reflects the importance of the hidden state $h_i$ relative to the hidden state $s_{j-1}$ in determining the next hidden state $s_j$ and predicting $y_j$;
where $x$ is the source-language sentence, and $V_a$, $W_a$, and $U_a$ are all parameters to be learned in the neural network.
The neural machine translation dynamic decoding system based on entropy, wherein module 3 comprises:
computing the sampling probability $p$ as a decay function of the training progress, where $\mu$ is a hyperparameter and $e$ is the number of training rounds.
The neural machine translation dynamic decoding system based on entropy, wherein module 6 comprises:
$o_j = W_o t_j$;
$P_j = \mathrm{softmax}(o_j)$;
where $e_{j-1}$ is the entropy of the probability distribution of the word predicted at the $(j-1)$-th time step, and $W_o$ is a parameter to be learned in the neural network.
The neural machine translation dynamic decoding system based on entropy, wherein module 7 comprises:
the encoder network computes the list of hidden vectors $(h_1, \ldots, h_{|x|})$ corresponding to the source-language sentence, in which each word $x_i$ has a word-vector representation and $h_i$ is the hidden-vector representation corresponding to word $x_i$.
Claims (10)
1. An entropy-based neural machine translation dynamic decoding method, characterized by comprising:
Step 1, transmitting the word vectors of the words in a source-language sentence of the training corpus into an encoder network to obtain the encoding vector list $(h_1, \ldots, h_{|x|})$ of the source-language sentence and the hidden-state vector $s_{j-1}$ of the $(j-1)$-th time step;
Step 2, the attention network obtaining a context vector $c_j$ from the encoding vector list $(h_1, \ldots, h_{|x|})$ and the hidden-state vector $s_{j-1}$;
Step 3, obtaining the $(j-1)$-th real target-language word $y^*_{j-1}$ in the training corpus and the target-language word $\hat{y}_{j-1}$ predicted at the $(j-1)$-th time step, selecting the real target-language word $y^*_{j-1}$ with probability $p$ and the predicted target-language word $\hat{y}_{j-1}$ with probability $1-p$, to obtain a selection result $y_{j-1}$;
Step 4, according to the selection result $y_{j-1}$, obtaining the entropy $e_{j-1}$ of the $(j-1)$-th time step by the formula $e_{j-1} = -\sum_{i=1}^{N} p_{i,j-1}\log p_{i,j-1}$, where $N$ is the size of the target-language lexicon and $p_{i,j-1}$ is the predicted probability of the $i$-th target-language word;
Step 5, inputting the selection result $y_{j-1}$, the hidden-state vector $s_{j-1}$, the context vector $c_j$, and the entropy $e_{j-1}$ into a decoder network to obtain the hidden-state vector $s_j$ of the current $j$-th time step;
Step 6, obtaining the target-language word $\hat{y}_j$ of the $j$-th time step according to $y_{j-1}$, the hidden-state vector $s_j$, and the context vector $c_j$;
Step 7, transmitting the encoding vector list $(h_1, \ldots, h_{|x|})$, the hidden-state vector $s_j$ of the $j$-th time step, the $j$-th real target-language word $y^*_j$ in the training corpus, and the target-language word $\hat{y}_j$ predicted at the $j$-th time step to the $(j+1)$-th time step, and continuing the decoding process until the special end-of-sentence symbol EOS is generated.
2. An entropy-based neural machine translation dynamic decoding method as claimed in claim 1, wherein the step 2 comprises:
obtaining the context vector $c_j = \sum_{i=1}^{|x|} \alpha_{ij} h_i$ through the attention network, where the weight $\alpha_{ij}$ is the softmax-normalized score $V_a^{\top}\tanh(W_a s_{j-1} + U_a h_i)$ and reflects the importance of the hidden state $h_i$ relative to the hidden state $s_{j-1}$ in determining the next hidden state $s_j$ and predicting $y_j$;
5. An entropy-based neural machine translation dynamic decoding method as claimed in claim 4, wherein the step 7 comprises:
the encoder network computes the list of encoding vectors $(h_1, \ldots, h_{|x|})$ corresponding to the source-language sentence, in which each word $x_i$ has a word-vector representation and $h_i$ is the hidden-vector representation corresponding to word $x_i$.
6. An entropy-based neural machine translation dynamic decoding system, characterized by comprising:
Module 1, which transmits the word vectors of the words in a source-language sentence of the training corpus into an encoder network to obtain the encoding vector list $(h_1, \ldots, h_{|x|})$ of the source-language sentence and the hidden-state vector $s_{j-1}$ of the $(j-1)$-th time step;
Module 2, in which the attention network obtains a context vector $c_j$ from the encoding vector list $(h_1, \ldots, h_{|x|})$ and the hidden-state vector $s_{j-1}$;
Module 3, which obtains the $(j-1)$-th real target-language word $y^*_{j-1}$ in the training corpus and the target-language word $\hat{y}_{j-1}$ predicted at the $(j-1)$-th time step, selects the real target-language word $y^*_{j-1}$ with probability $p$ and the predicted target-language word $\hat{y}_{j-1}$ with probability $1-p$, and obtains a selection result $y_{j-1}$;
Module 4, which, according to the selection result $y_{j-1}$, obtains the entropy $e_{j-1}$ of the $(j-1)$-th time step by the formula $e_{j-1} = -\sum_{i=1}^{N} p_{i,j-1}\log p_{i,j-1}$, where $N$ is the size of the target-language lexicon and $p_{i,j-1}$ is the predicted probability of the $i$-th target-language word;
Module 5, which inputs the selection result $y_{j-1}$, the hidden-state vector $s_{j-1}$, the context vector $c_j$, and the entropy $e_{j-1}$ into a decoder network to obtain the hidden-state vector $s_j$ of the current $j$-th time step;
Module 6, which obtains the target-language word $\hat{y}_j$ of the $j$-th time step according to $y_{j-1}$, the hidden-state vector $s_j$, and the context vector $c_j$;
Module 7, which transmits the encoding vector list $(h_1, \ldots, h_{|x|})$, the hidden-state vector $s_j$ of the $j$-th time step, the $j$-th real target-language word $y^*_j$ in the training corpus, and the target-language word $\hat{y}_j$ predicted at the $j$-th time step to the $(j+1)$-th time step, and continues the decoding process until the special end-of-sentence symbol EOS is generated.
7. An entropy-based neural machine translation dynamic decoding system as claimed in claim 6, wherein the module 2 comprises:
obtaining the context vector $c_j = \sum_{i=1}^{|x|} \alpha_{ij} h_i$ through the attention network, where the weight $\alpha_{ij}$ is the softmax-normalized score $V_a^{\top}\tanh(W_a s_{j-1} + U_a h_i)$ and reflects the importance of the hidden state $h_i$ relative to the hidden state $s_{j-1}$ in determining the next hidden state $s_j$ and predicting $y_j$;
10. An entropy-based neural machine translation dynamic decoding system as claimed in claim 9, wherein the module 7 comprises:
the encoder network computes the list of encoding vectors $(h_1, \ldots, h_{|x|})$ corresponding to the source-language sentence, in which each word $x_i$ has a word-vector representation and $h_i$ is the hidden-vector representation corresponding to word $x_i$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010151246.4A CN111428519B (en) | 2020-03-06 | 2020-03-06 | Entropy-based neural machine translation dynamic decoding method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010151246.4A CN111428519B (en) | 2020-03-06 | 2020-03-06 | Entropy-based neural machine translation dynamic decoding method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111428519A CN111428519A (en) | 2020-07-17 |
CN111428519B (en) | 2022-03-29
Family
ID=71547442
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010151246.4A | Entropy-based neural machine translation dynamic decoding method and system | 2020-03-06 | 2020-03-06
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111428519B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112016332B (en) * | 2020-08-26 | 2021-05-07 | 华东师范大学 | Multi-modal machine translation method based on variational reasoning and multi-task learning |
CN112836485B (en) * | 2021-01-25 | 2023-09-19 | 中山大学 | Similar medical record prediction method based on neural machine translation |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110795912A (en) * | 2019-09-19 | 2020-02-14 | 平安科技(深圳)有限公司 | Method, device and equipment for encoding text based on neural network and storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10049106B2 (en) * | 2017-01-18 | 2018-08-14 | Xerox Corporation | Natural language generation through character-based recurrent neural networks with finite-state prior knowledge |
CN108984539B (en) * | 2018-07-17 | 2022-05-17 | 苏州大学 | Neural machine translation method based on translation information simulating future moment |
2020-03-06: application CN202010151246.4A filed; patent CN111428519B granted and active.
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110795912A (en) * | 2019-09-19 | 2020-02-14 | 平安科技(深圳)有限公司 | Method, device and equipment for encoding text based on neural network and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN111428519A (en) | 2020-07-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10762305B2 (en) | Method for generating chatting data based on artificial intelligence, computer device and computer-readable storage medium | |
US11210475B2 (en) | Enhanced attention mechanisms | |
US11776531B2 (en) | Encoder-decoder models for sequence to sequence mapping | |
CN107870902B (en) | Neural machine translation system | |
CN110442878B (en) | Translation method, training method and device of machine translation model and storage medium | |
CN114787914A (en) | System and method for streaming end-to-end speech recognition with asynchronous decoder | |
CN110326002B (en) | Sequence processing using online attention | |
CN111488807A (en) | Video description generation system based on graph convolution network | |
WO2020048389A1 (en) | Method for compressing neural network model, device, and computer apparatus | |
CN112528655B (en) | Keyword generation method, device, equipment and storage medium | |
CN111128137A (en) | Acoustic model training method and device, computer equipment and storage medium | |
CN110929092A (en) | Multi-event video description method based on dynamic attention mechanism | |
CN110569505B (en) | Text input method and device | |
CN110598224A (en) | Translation model training method, text processing device and storage medium | |
CN111428519B (en) | Entropy-based neural machine translation dynamic decoding method and system | |
Li et al. | End-to-end speech recognition with adaptive computation steps | |
US10783452B2 (en) | Learning apparatus and method for learning a model corresponding to a function changing in time series | |
CN108763230B (en) | Neural machine translation method using external information | |
CN111401081A (en) | Neural network machine translation method, model and model forming method | |
CN113609284A (en) | Method and device for automatically generating text abstract fused with multivariate semantics | |
Mezzoudj et al. | An empirical study of statistical language models: n-gram language models vs. neural network language models | |
US11694041B2 (en) | Chapter-level text translation method and device | |
CN109635302B (en) | Method and device for training text abstract generation model | |
Brakel et al. | An actor-critic algorithm for sequence prediction | |
CN114730380A (en) | Deep parallel training of neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |