CN109271643A - Training method for a translation model, translation method, and devices - Google Patents
Training method for a translation model, translation method, and devices
- Publication number: CN109271643A
- Application number: CN201810896694.XA
- Authority
- CN
- China
- Prior art keywords
- hidden state
- rnn
- training
- translation model
- time step
- Prior art date
- 2018-08-08
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
Abstract
Embodiments of the invention provide a training method for a translation model, a translation method, and corresponding devices. The training method includes: extracting training corpora; preprocessing the training corpora to obtain preprocessed text; performing word segmentation on the preprocessed text to obtain segmented text information; encoding the segmented text information in both the forward and backward directions with a bidirectional RNN encoder and determining the hidden state of the bidirectional RNN encoder at each time step; and decoding, with a unidirectional RNN decoder, the hidden state and semantic vector of each time step of the bidirectional RNN encoder to build the translation model. This avoids compressing the semantic vectors of all time steps into one fixed-length vector, which dilutes or overwrites contextual detail and forces every decoder time step to reference the same fixed-length vector, degrading translation accuracy. Instead, the decoder references a different semantic vector at each time step, improving the accuracy with which the translation model translates source sentences.
Description
Technical field
The present invention relates to the field of translation technology, and in particular to a training method for a translation model, a translation method, and a corresponding training device and translation device.
Background technique
Currently, the conversion of source statement to object statement is actually a kind of conversion of sequence to sequence (seq2seq), in order to
It realizes the conversion of sequence to sequence, generallys use the realization of coding-decoded model (Encoder-Decoder) frame.
List entries, is exactly embedded as the vector of theorem in Euclid space by coding;Decoding, the vector exactly encoded are converted to
Output sequence, the process coded and decoded in the prior art can be realized by neural network model RNN.
In existing coding and decoding frame, list entries is compressed into the vector of a fixed length by encoder, then from fixed length
Vector decoding generates output sequence, and wherein the fixed length vector includes each of source statement detailed information, when sentence source statement is long
It is detailed information meeting fixed, that the content that source statement first inputs carries since fixed length vector includes information content when spending long
It by the detailed information dilution of the content of rear input or is capped, the longer especially source statement length the more serious, this is allowed for
The general details information of source statement list entries can not be obtained in decoder decoding, cause the decoded accuracy of decoder by
Negative effect, reduces the accuracy that traditional code-decoded model converts source statement.
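To make the framework concrete, the following is a minimal sketch of such an encoder-decoder in Python with PyTorch; the layer sizes, the use of GRUs, and all names are illustrative assumptions rather than details taken from the prior art discussed here. Note how the encoder's final hidden state is the single fixed-length vector criticized above.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder: the whole source is squeezed into one vector."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=600, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # The encoder's last hidden state is the fixed-length vector.
        _, fixed_len_vec = self.encoder(self.src_emb(src_ids))
        # Every decoder time step starts from that same vector.
        dec_states, _ = self.decoder(self.tgt_emb(tgt_ids), fixed_len_vec)
        return self.out(dec_states)  # logits over the target vocabulary
```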
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a training method for a translation model, a translation method, and a corresponding training device and translation device, so as to solve the problem of low translation accuracy in existing encoder-decoder translation models.
To solve the above problems, the invention discloses a training method for a translation model, comprising:
randomly extracting a preset number of training corpora from a preset parallel corpus;
preprocessing the training corpora to obtain preprocessed text;
performing word segmentation on the preprocessed text to obtain segmented text information;
encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
Optionally, the step of encoding the segmented text information in the forward and backward directions with the bidirectional RNN encoder and determining the hidden state of the bidirectional RNN encoder at each time step includes:
the forward RNN encodes the segmented text information in the forward direction to obtain a forward word-vector feature sequence X_F = (X_1, X_2, ..., X_T), and generates a forward hidden state Fh_i at each time step i, the forward hidden states over all time steps being (Fh_1, Fh_2, ..., Fh_T), i = 1, 2, ..., T; F denotes the forward hidden-state parameters of the translation model;
the backward RNN encodes the segmented text information in the backward direction to obtain a backward word-vector feature sequence X_B = (X_T, X_{T-1}, ..., X_2, X_1), and generates a backward hidden state Bh_i at each time step i, the backward hidden states over all time steps being (Bh_1, Bh_2, ..., Bh_T), i = 1, 2, ..., T; B denotes the backward hidden-state parameters of the translation model;
the hidden state h_i of the bidirectional RNN encoder at each time step is determined from the forward hidden state Fh_i and the backward hidden state Bh_i, where h_i = [Fh_i, Bh_i].
Optionally, the step of decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with the unidirectional RNN decoder and building the translation model includes:
decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with the unidirectional RNN decoder to obtain the decoded-state function of the translation model.
Optionally, the step of decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with the unidirectional RNN decoder to obtain the decoded-state function of the translation model includes:
obtaining the decoded state S_{i-1} of the unidirectional RNN decoder at time step i-1 and the corresponding label Y_{i-1};
obtaining the hidden state h_i and the semantic vector C_i of the bidirectional RNN encoder at the current time step i;
determining the decoded state S_i of the unidirectional RNN decoder at the current time step i from the decoded state S_{i-1}, the label Y_{i-1}, the hidden state h_i, and the semantic vector C_i;
where S_i = P(S_{i-1}, Y_{i-1}, h_i, C_i), and P(·) denotes the decoded-state function.
Optionally, the semantic vector C_i is a weighted sum of the hidden states h = [h_1, h_2, ..., h_T] of the bidirectional RNN encoder.
Optionally, the step of building the translation model further includes:
extracting, from the parallel corpus, the training target corpus aligned with the training corpus;
calculating, from the decoded-state function, the probability that each training corpus predicts the training target corpus;
calculating a loss rate from a preset loss function and the probability;
calculating a gradient from the loss rate;
judging whether the gradient satisfies a preset iteration condition;
if so, ending the translation model training;
if not, performing gradient descent on the model parameters of the translation model using the gradient and a preset learning rate, and returning to the step of extracting, from the parallel corpus, the training target corpus aligned with the training corpus.
To solve the above problems, an embodiment of the invention also discloses a translation method, comprising:
obtaining a sentence to be translated;
inputting the sentence to be translated into a pre-trained translation model and extracting the target sentence;
wherein the translation model is trained in the following manner:
randomly extracting a preset number of training corpora from a preset parallel corpus;
preprocessing the training corpora to obtain preprocessed text;
performing word segmentation on the preprocessed text to obtain segmented text information;
encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
An embodiment of the invention also discloses a training device for a translation model, comprising:
a training corpus extraction module for randomly extracting a preset number of training corpora from a preset parallel corpus;
a preprocessing module for preprocessing the training corpora to obtain preprocessed text;
a word segmentation module for performing word segmentation on the preprocessed text to obtain segmented text information;
a modeling module for encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
Optionally, the modeling module includes:
a forward encoding submodule for the forward RNN to encode the segmented text information in the forward direction to obtain a forward word-vector feature sequence X_F = (X_1, X_2, ..., X_T) and to generate a forward hidden state Fh_i at each time step i, the forward hidden states over all time steps being (Fh_1, Fh_2, ..., Fh_T), i = 1, 2, ..., T; F denotes the forward hidden-state parameters of the translation model;
a backward encoding submodule for the backward RNN to encode the segmented text information in the backward direction to obtain a backward word-vector feature sequence X_B = (X_T, X_{T-1}, ..., X_2, X_1) and to generate a backward hidden state Bh_i at each time step i, the backward hidden states over all time steps being (Bh_1, Bh_2, ..., Bh_T), i = 1, 2, ..., T; B denotes the backward hidden-state parameters of the translation model;
a bidirectional-RNN-encoder hidden-state determination submodule for determining the hidden state h_i of the bidirectional RNN encoder at each time step from the forward hidden state Fh_i and the backward hidden state Bh_i, where h_i = [Fh_i, Bh_i].
Optionally, the modeling module includes:
a decoding submodule for decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with the unidirectional RNN decoder to obtain the decoded-state function of the translation model.
Optionally, the decoding submodule includes:
a previous-time-step state acquisition submodule for obtaining the decoded state S_{i-1} of the unidirectional RNN decoder at time step i-1 and the corresponding label Y_{i-1};
a current-time-step hidden-state and semantic-vector acquisition submodule for obtaining the hidden state h_i and the semantic vector C_i of the bidirectional RNN encoder at the current time step i;
a decoded-state determination submodule for determining the decoded state S_i of the unidirectional RNN decoder at the current time step i from the decoded state S_{i-1}, the label Y_{i-1}, the hidden state h_i, and the semantic vector C_i;
where S_i = P(S_{i-1}, Y_{i-1}, h_i, C_i), and P(·) denotes the decoded-state function.
Optionally, the semantic vector C_i is a weighted sum of the hidden states h = [h_1, h_2, ..., h_T] of the bidirectional RNN encoder.
Optionally, the training device further includes:
a testing corpus extraction module for randomly extracting testing corpora from the parallel corpus, the testing corpora including test source corpora and test target corpora;
a probability calculation module for calculating, from the decoded-state function, the probability that each test source corpus predicts the test target corpus;
a loss rate calculation module for calculating a loss rate from a preset loss function and the probability;
a gradient calculation module for calculating a gradient from the loss rate;
an iteration condition judgment module for judging whether the gradient satisfies a preset iteration condition;
a training ending module for ending the translation model training if it does;
a parameter adjustment module for performing gradient descent on the model parameters of the translation model using the gradient and a preset learning rate if it does not, and returning to the testing corpus extraction module.
An embodiment of the invention also discloses a translation device, comprising:
a to-be-translated sentence acquisition module for obtaining a sentence to be translated;
a target sentence extraction module for inputting the sentence to be translated into a pre-trained translation model and extracting the target sentence;
wherein the translation model is trained with the following modules:
a training corpus extraction module for randomly extracting a preset number of training corpora from a preset parallel corpus;
a preprocessing module for preprocessing the training corpora to obtain preprocessed text;
a word segmentation module for performing word segmentation on the preprocessed text to obtain segmented text information;
a modeling module for encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
Compared with the background art, the embodiments of the present invention have the following advantages:
In the embodiments of the present invention, the segmented text information is encoded in both the forward and backward directions by a bidirectional RNN encoder, the hidden state of the bidirectional RNN encoder at each time step is determined, and a unidirectional RNN decoder decodes the hidden state and semantic vector of each time step of the bidirectional RNN encoder to build the translation model. Because the bidirectional RNN encoder encodes in both directions and a hidden state and semantic vector are determined for each time step, the hidden states and semantic vectors of all time steps are not compressed into a single fixed-length vector. This avoids the problems that compressing all information into one fixed-length vector dilutes or overwrites information and that every decoder time step must reference the same fixed-length vector, both of which lower translation accuracy, and thereby improves the accuracy with which the translation model translates source sentences.
Description of the drawings
Fig. 1 is a flowchart of the steps of an embodiment of a translation model training method of the invention;
Fig. 2 is a flowchart of the steps of an embodiment of a translation method of the invention;
Fig. 3 is a structural block diagram of an embodiment of a translation model training device of the invention;
Fig. 4 is a block diagram of a translation device of the invention.
Specific embodiments
In order to make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flowchart of the steps of an embodiment of a translation model training method of the embodiment of the present invention is shown, which may specifically include the following steps:
Step 101, randomly extract a preset number of training corpora from a preset parallel corpus.
A parallel corpus, also known as a translation corpus, is a corpus composed of original texts and their translations, used for the training and testing of machine translation models. It may, for example, be composed of original texts and translations in Chinese and Uighur, Chinese and English, Chinese and Japanese, or Japanese and English.
In the embodiment of the present invention, a preset number of training corpora can be extracted at random from the parallel corpus. For example, 1000 Chinese-Uighur sentence pairs can be extracted from a parallel corpus composed of Chinese and Uighur as training corpora, with Uighur defined as the translation and Chinese as the original text.
Step 102, preprocess the training corpora to obtain preprocessed text.
In embodiments of the present invention, the training corpora can be subjected to preprocessing such as regularization, error correction, and digit normalization.
Step 103, perform word segmentation on the preprocessed text to obtain segmented text information.
After the training corpora are preprocessed, word segmentation can be performed to obtain their segmented text information. For example, performing word segmentation on a preprocessed source sentence yields that sentence's segmented text information: if the preprocessed source sentence is the Chinese sentence glossed as "I did not eat lunch today", character-level segmentation yields the tokens (each character glossed in English): "I", "the present", "day", "in", "noon", "no", "eating", "meal", "", ".".
Step 104, encode the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determine the hidden state of the bidirectional RNN encoder at each time step, and decode the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
In a preferred embodiment of the invention, the translation model can be built by encoding and decoding the segmented text information. Specifically, step 104 may include the following sub-steps:
Sub-step S11: the forward RNN encodes the segmented text information in the forward direction to obtain a forward word-vector feature sequence X_F = (X_1, X_2, ..., X_T), and generates a forward hidden state Fh_i at each time step i, the forward hidden states over all time steps being (Fh_1, Fh_2, ..., Fh_T), i = 1, 2, ..., T; F denotes the forward hidden-state parameters of the translation model.
In practical applications, a dictionary can be preset in which each word corresponds to a code and the code of each word is unique. The code of each word in the segmented text information can be looked up in the dictionary, and the codes are then arranged in order to form the forward word-vector feature sequence.
For example, suppose the dictionary codes the characters as follows: "I": 102, "the present": 38, "day": 5, "in": 138, "noon": 321, "no": 8, "eating": 29, "meal": 290, "": 202, ".": 0. Then the segmented text information "I", "the present", "day", "in", "noon", "no", "eating", "meal", "", "." yields the forward word-vector feature sequence [102, 38, 5, 138, 321, 8, 29, 290, 202, 0].
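As a minimal sketch of this lookup in Python, assuming a hypothetical dictionary whose codes simply mirror the example above (the glossed token names and the unknown-token fallback are illustrative assumptions):

```python
# Hypothetical token-to-code dictionary mirroring the example above.
vocab = {"I": 102, "the present": 38, "day": 5, "in": 138, "noon": 321,
         "no": 8, "eating": 29, "meal": 290, "": 202, ".": 0}

def encode_tokens(tokens, unk=1):
    """Look up each segmented token's unique code, preserving order."""
    return [vocab.get(tok, unk) for tok in tokens]

tokens = ["I", "the present", "day", "in", "noon",
          "no", "eating", "meal", "", "."]
print(encode_tokens(tokens))  # [102, 38, 5, 138, 321, 8, 29, 290, 202, 0]
```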
Meanwhile positive term vector characteristic sequence is inputted in positive RNN, by RNN according to the hidden state parameter F of preset forward direction
The hidden state Fhi of forward direction for calculating each time step i obtains the hidden state Fh of forward direction of all time stepsi=(Fh1, Fh2...,
Fht), i=l, 2 ..., T, wherein T is time step.
Sub-step S12: the backward RNN encodes the segmented text information in the backward direction to obtain a backward word-vector feature sequence X_B = (X_T, X_{T-1}, ..., X_2, X_1), and generates a backward hidden state Bh_i at each time step i, the backward hidden states over all time steps being (Bh_1, Bh_2, ..., Bh_T), i = 1, 2, ..., T; B denotes the backward hidden-state parameters of the translation model.
For example, for the segmented text information of sub-step S11 ("I", "the present", "day", "in", "noon", "no", "eating", "meal", "", "."), backward RNN encoding yields the backward word-vector feature sequence [0, 202, 290, 29, 8, 321, 138, 5, 38, 102].
Meanwhile reversed term vector characteristic sequence is inputted in reversed RNN, by RNN according to preset reversed hidden state parameter B
The hidden state Bhi of back for calculating each time step i obtains the reversed hidden state Bh of all time stepsi=(Bh1, Bh2...,
Bht), i=l, 2 ..., T, wherein T is time step.
Sub-step S13: determine the hidden state h_i of the bidirectional RNN encoder at each time step from the forward hidden state Fh_i and the backward hidden state Bh_i, where h_i = [Fh_i, Bh_i].
In practical applications, the unidirectional RNN decoder can only use a single hidden state at each time step, so the hidden state of the bidirectional RNN encoder at each time step must be determined. Specifically, the hidden state h_i = [Fh_i, Bh_i] of the bidirectional RNN encoder at each time step i can be determined jointly from the forward hidden state Fh_i and the backward hidden state Bh_i of that time step, for example by summation.
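A minimal sketch of this bidirectional encoding in PyTorch follows; the GRU choice and the dimensions are assumptions, and the per-time-step combination is shown as the concatenation [Fh_i, Bh_i]:

```python
import torch
import torch.nn as nn

emb_dim, hid_dim, vocab_size = 600, 512, 100_000
embedding = nn.Embedding(vocab_size, emb_dim)

# bidirectional=True runs a forward RNN over (X_1..X_T) and a backward RNN
# over (X_T..X_1) with separate parameter sets (F and B in the text above).
encoder = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)

src_ids = torch.tensor([[102, 38, 5, 138, 321, 8, 29, 290, 202, 0]])
h, _ = encoder(embedding(src_ids))
# h[:, i, :hid_dim] is Fh_i and h[:, i, hid_dim:] is Bh_i, so each time
# step's output is already the combined hidden state h_i = [Fh_i, Bh_i].
print(h.shape)  # torch.Size([1, 10, 1024])
```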
Sub-step S14: decode the hidden state and semantic vector of each time step of the bidirectional RNN encoder with the unidirectional RNN decoder to obtain the decoded-state function of the translation model.
In the embodiment of the present invention, sub-step S14 may include the following sub-steps:
Sub-step S141: obtain the decoded state S_{i-1} of the unidirectional RNN decoder at time step i-1 and the corresponding label Y_{i-1};
Sub-step S142: obtain the hidden state h_i and the semantic vector C_i of the bidirectional RNN encoder at the current time step i;
Sub-step S143: determine the decoded state S_i of the unidirectional RNN decoder at the current time step i from the decoded state S_{i-1}, the label Y_{i-1}, the hidden state h_i, and the semantic vector C_i;
where S_i = P(S_{i-1}, Y_{i-1}, h_i, C_i), and P(·) denotes the decoded-state function.
In the embodiment of the present invention, the semantic vector C_i represents the most suitable contextual information for the unidirectional RNN decoder to select when outputting each token of the predicted target sentence. Specifically, the semantic vector C_i can be a weighted sum of the hidden states h = [h_1, h_2, ..., h_T] of the bidirectional RNN encoder.
Specifically, this can be realized by the following equations (1)-(3):

C_i = Σ_{j=1}^{T} α_ij · h_j    (1)

α_ij = exp(e_ij) / Σ_{k=1}^{T} exp(e_ik)    (2)

e_ik = g(S_{i-1}, h_k)    (3)

where g(·) is an RNN neural network and i, j, k are time-step indices, with i = 1, 2, ..., T, j = 1, 2, ..., T, and k = 1, 2, ..., T.
e_ik scores the probability that output Y_i is translated from input X_k; α_ij is the weight, computed from these scores, of the j-th source word for the i-th output target word. Multiplying each weight α_ij by the corresponding hidden state h_j and summing the products yields the semantic vector C_i. Each semantic vector C_i thus expresses the weight given to the hidden state h_j of each input X_j when producing output Y_i, i.e., it determines which inputs X_j are most important to output Y_i.
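A compact sketch of equations (1)-(3) in PyTorch, assuming a dot-product form for the score function g(·) (the text leaves g as a generic network, so this choice is illustrative):

```python
import torch
import torch.nn.functional as F

def semantic_vector(s_prev, h):
    """Compute C_i and the weights alpha_ij for one decoder time step.

    s_prev: [batch, dim]     decoder state S_{i-1}
    h:      [batch, T, dim]  encoder hidden states h_1..h_T
    """
    e = torch.bmm(h, s_prev.unsqueeze(2)).squeeze(2)  # (3) e_ik = g(S_{i-1}, h_k)
    alpha = F.softmax(e, dim=1)                       # (2) normalized weights
    c = torch.bmm(alpha.unsqueeze(1), h).squeeze(1)   # (1) C_i = sum_j alpha_ij h_j
    return c, alpha
```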
Since the unidirectional RNN decoder, unlike the bidirectional RNN encoder, decodes in a single direction, when decoding it references not only the hidden state h_i of each time step of the bidirectional RNN encoder but also the semantic vector C_i of each time step. The state S_i of the decoder at time step i is jointly determined by the decoder's state S_{i-1} at time step i-1, the corresponding label Y_{i-1}, and the hidden state h_i and semantic vector C_i of the bidirectional RNN encoder aligned with the current time step, so that each time step of the unidirectional RNN decoder can be decoded with reference to a different semantic vector. This avoids the problems that compressing all information into one fixed-length vector dilutes or overwrites contextual information and that every decoder time step references the same fixed-length vector, both of which lower translation accuracy; the decoder instead references a different semantic vector at each time step, which improves the accuracy with which the translation model translates source sentences.
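One hypothetical realization of the update S_i = P(S_{i-1}, Y_{i-1}, h_i, C_i) feeds the embedding of the previous label together with h_i and C_i into a GRU cell; the cell type and the concatenation are assumptions, not prescribed by the text:

```python
import torch
import torch.nn as nn

class DecoderStep(nn.Module):
    """One unidirectional decoding step: S_i = P(S_{i-1}, Y_{i-1}, h_i, C_i)."""
    def __init__(self, emb_dim, enc_dim, dec_dim, tgt_vocab):
        super().__init__()
        self.emb = nn.Embedding(tgt_vocab, emb_dim)
        # P(.) realized as a GRU cell over the concatenation
        # [emb(Y_{i-1}); h_i; C_i], where h_i and C_i each have size enc_dim.
        self.cell = nn.GRUCell(emb_dim + 2 * enc_dim, dec_dim)
        self.out = nn.Linear(dec_dim, tgt_vocab)

    def forward(self, s_prev, y_prev, h_i, c_i):
        x = torch.cat([self.emb(y_prev), h_i, c_i], dim=-1)
        s_i = self.cell(x, s_prev)   # the new decoded state S_i
        return s_i, self.out(s_i)    # logits used to predict Y_i
```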
A preferred embodiment of the invention may further include the following steps:
Step 105, extract from the parallel corpus the training target corpus aligned with the training corpus.
In the embodiment of the invention, training corpora and training target corpora are paired in the parallel corpus, so the training target corpus aligned with the training corpus can be extracted from the parallel corpus.
Step 106, calculate, from the decoded-state function, the probability that each training corpus predicts the training target corpus.
Before training, the model parameters, learning rate, and iteration count of the translation model are initialized with configured initial values. The randomly extracted training corpora are then input into the translation model and candidate target corpora are extracted. There may be multiple candidate target corpora, among which is the training target corpus. Each candidate target corpus has a score, which can be, for example, the probability that the candidate target corpus is the training target corpus aligned with the training corpus.
In a concrete implementation, the probability that a candidate target corpus is the training target corpus can be calculated by means of multiple regression.
Step 107, calculate a loss rate from a preset loss function and the probability.
During training, the score of the training target corpus may not match the actually calculated score, i.e., the prediction deviates, so the translation model parameters need to be adjusted according to a loss rate, which can be calculated from the preset loss function and the probability.
Step 108, calculate a gradient from the loss rate.
After the loss rate is obtained, a gradient can be calculated with which to adjust the model parameters. In practical applications, the gradient can be computed from the loss rate by taking partial derivatives.
Step 109, judge whether the gradient satisfies a preset iteration condition; if so, execute step 110; if not, execute step 111.
Step 110, end the translation model training.
Step 111, perform gradient descent on the model parameters of the translation model using the gradient and a preset learning rate, the model parameters including the forward hidden-state parameters and the backward hidden-state parameters, and return to the step of extracting from the parallel corpus the training target corpus aligned with the training corpus.
If the calculated gradient does not satisfy the preset iteration condition, for example if the difference between consecutive gradients is greater than or equal to a preset difference threshold, or the iteration count has not been reached, the model parameters of the translation model are updated, for example its forward and backward hidden-state parameters, and the next iteration proceeds with the updated model parameters and the preset learning rate. Conversely, if the gradient satisfies the preset iteration condition, for example if the difference between consecutive gradients is less than or equal to the preset difference threshold, or the iteration count has been reached, training ends and the model parameters are output.
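A schematic training loop under these rules might look as follows; the cross-entropy loss, the Adam optimizer, and the convergence test on consecutive gradient norms are assumptions consistent with, but not dictated by, the text:

```python
import torch

def train(model, batches, lr=1e-3, max_iters=1_000_000, eps=1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # adaptive learning rate
    loss_fn = torch.nn.CrossEntropyLoss()              # MLE-style loss
    prev_norm = None
    for it, (src, tgt) in enumerate(batches):
        logits = model(src, tgt[:, :-1])
        loss = loss_fn(logits.reshape(-1, logits.size(-1)),
                       tgt[:, 1:].reshape(-1))          # loss rate
        opt.zero_grad()
        loss.backward()                                 # gradient from the loss
        norm = sum(p.grad.norm().item() for p in model.parameters()
                   if p.grad is not None)
        # Iteration condition: consecutive gradients close enough, or cap hit.
        if (prev_norm is not None and abs(norm - prev_norm) <= eps) \
                or it >= max_iters:
            break                                       # end training, keep parameters
        prev_norm = norm
        opt.step()                                      # gradient descent update
```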
During training, gradient descent can be performed with SGD (stochastic gradient descent), Adadelta, or Adam (Adaptive Moment Estimation), and the loss rate can be calculated with loss functions such as MLE (Maximum Likelihood Estimation), MRT (Minimum Risk Training), or SST (Semi-supervised Training). The embodiment of the present invention places no restriction on the gradient descent method or the loss function used.
The construction of a translation model from a bilingual Uighur-Chinese parallel corpus of 4 million sentence pairs is illustrated below. The detailed process is as follows:
(1) Corpus data preparation: first, 1000 pairs are extracted at random from the 4 million pairs as the test set; then 1000 pairs are extracted from the remaining pairs as the validation set; the remaining pairs serve as the training corpus.
(2) System construction: the modeling framework is built on a provisioned server, and the RNN is deployed.
(3) Translation model training: the encoding vocabulary is set to 100,000 words, and the word-vector dimension is set to 600 through the RNN parameters; the Adam optimizer is selected to adapt the learning rate, the iteration cap is set to 1,000,000, and model training begins once initialization is complete.
(4) Translation model verification: the trained translation model is tested on the 1000-pair test set, finally obtaining a BLEU (bilingual evaluation understudy) value. BLEU is an auxiliary tool for evaluating the quality of bilingual intertranslation: it evaluates the similarity between a machine translation and a reference translation. If the BLEU value is within a preset range, the adjustment of the translation model parameters ends; otherwise the translation model parameters are adjusted until the BLEU value is within the preset range.
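For step (4), a BLEU value can be computed with, for example, NLTK's corpus_bleu; the toolkit and the toy sentences are illustrative, as the text does not prescribe a particular implementation:

```python
from nltk.translate.bleu_score import corpus_bleu

# references: for each test sentence, a list of reference token lists;
# hypotheses: the model's output tokens for each test sentence.
references = [[["i", "did", "not", "eat", "lunch", "today"]]]
hypotheses = [["i", "did", "not", "eat", "lunch"]]

bleu = corpus_bleu(references, hypotheses)
print(f"BLEU = {bleu:.3f}")  # stop tuning once this falls in the preset range
```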
In the embodiment of the present invention, the bidirectional RNN encoder encodes in both the forward and backward directions, and the hidden state and semantic vector of each time step are determined. This avoids compressing the hidden states and semantic vectors of all time steps into a single fixed-length vector, and thus avoids the problems that compressing all information into one fixed-length vector dilutes or overwrites information and that every decoder time step references the same fixed-length vector, both of which lower translation accuracy; the accuracy with which the translation model translates source sentences is thereby improved.
Referring to Fig. 2, a flowchart of the steps of an embodiment of a translation method of the embodiment of the present invention is shown, which may specifically include the following steps:
Step 201, obtain a sentence to be translated.
In the embodiment of the present invention, the sentence to be translated may be text information directly input by a user, such as a sentence to be translated entered on a PC, a mobile terminal, or another device, or a sentence obtained by converting into text information a voice signal collected by a voice capture device. For example, the sentence to be translated may be Chinese and the target sentence Uighur.
Step 202, input the sentence to be translated into a pre-trained translation model and extract the target sentence.
In the embodiment of the present invention, a translation model can be pre-established, and the sentence to be translated can be translated into the target sentence by the translation model; for example, after Chinese is input into the translation model, it can be translated into Uighur.
In the embodiment of the present invention, the translation model is trained through the following steps:
Sub-step S21, randomly extract a preset number of training corpora from a preset parallel corpus;
Sub-step S22, preprocess the training corpora to obtain preprocessed text;
Sub-step S23, perform word segmentation on the preprocessed text to obtain segmented text information;
Sub-step S24, encode the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determine the hidden state of the bidirectional RNN encoder at each time step, and decode the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
For the training process of the translation model, reference can be made to the corresponding steps of the translation model training method, which are not detailed again here.
In the embodiment of the present invention, encoding with the bidirectional RNN encoder proceeds in both the forward and backward directions and determines the hidden state and semantic vector of each time step, and the unidirectional RNN decoder decodes the hidden state and semantic vector of each time step of the bidirectional RNN encoder to build the translation model. This avoids compressing the hidden states and semantic vectors of all time steps into a single fixed-length vector, and thus avoids the problems that compressing all information into one fixed-length vector dilutes or overwrites contextual information and that every decoder time step references the same fixed-length vector, both of which lower translation accuracy; the accuracy with which the translation model translates source sentences is thereby improved.
It should be noted that, for simplicity of description, the method embodiments are stated as series of action combinations, but those skilled in the art should understand that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention, some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 3, a structural block diagram of an embodiment of a translation model training device of the embodiment of the present invention is shown, which may specifically include the following modules:
a training corpus extraction module 301 for randomly extracting a preset number of training corpora from a preset parallel corpus;
a preprocessing module 302 for preprocessing the training corpora to obtain preprocessed text;
a word segmentation module 303 for performing word segmentation on the preprocessed text to obtain segmented text information;
a modeling module 304 for encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
Optionally, the modeling module 304 includes:
a forward encoding submodule for the forward RNN to encode the segmented text information in the forward direction to obtain a forward word-vector feature sequence X_F = (X_1, X_2, ..., X_T) and to generate a forward hidden state Fh_i at each time step i, the forward hidden states over all time steps being (Fh_1, Fh_2, ..., Fh_T), i = 1, 2, ..., T; F denotes the forward hidden-state parameters of the translation model;
a backward encoding submodule for the backward RNN to encode the segmented text information in the backward direction to obtain a backward word-vector feature sequence X_B = (X_T, X_{T-1}, ..., X_2, X_1) and to generate a backward hidden state Bh_i at each time step i, the backward hidden states over all time steps being (Bh_1, Bh_2, ..., Bh_T), i = 1, 2, ..., T; B denotes the backward hidden-state parameters of the translation model;
a bidirectional-RNN-encoder hidden-state determination submodule for determining the hidden state h_i of the bidirectional RNN encoder at each time step from the forward hidden state Fh_i and the backward hidden state Bh_i, where h_i = [Fh_i, Bh_i].
Optionally, the modeling module 304 includes:
a decoding submodule for decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with the unidirectional RNN decoder to obtain the decoded-state function of the translation model.
Optionally, the decoding submodule includes:
a previous-time-step state acquisition submodule for obtaining the decoded state S_{i-1} of the unidirectional RNN decoder at time step i-1 and the corresponding label Y_{i-1};
a current-time-step hidden-state and semantic-vector acquisition submodule for obtaining the hidden state h_i and the semantic vector C_i of the bidirectional RNN encoder at the current time step i;
a decoded-state determination submodule for determining the decoded state S_i of the unidirectional RNN decoder at the current time step i from the decoded state S_{i-1}, the label Y_{i-1}, the hidden state h_i, and the semantic vector C_i;
where S_i = P(S_{i-1}, Y_{i-1}, h_i, C_i), and P(·) denotes the decoded-state function.
Optionally, the semantic vector C_i is a weighted sum of the hidden states h = [h_1, h_2, ..., h_T] of the bidirectional RNN encoder.
Optionally, the training device further includes:
a testing corpus extraction module for randomly extracting testing corpora from the parallel corpus, the testing corpora including test source corpora and test target corpora;
a probability calculation module for calculating, from the decoded-state function, the probability that each test source corpus predicts the test target corpus;
a loss rate calculation module for calculating a loss rate from a preset loss function and the probability;
a gradient calculation module for calculating a gradient from the loss rate;
an iteration condition judgment module for judging whether the gradient satisfies a preset iteration condition;
a training ending module for ending the translation model training if it does;
a parameter adjustment module for performing gradient descent on the model parameters of the translation model using the gradient and a preset learning rate if it does not, and returning to the testing corpus extraction module.
Referring to Fig. 4, a structural block diagram of an embodiment of a translation device of the embodiment of the present invention is shown, which may specifically include the following modules:
a to-be-translated sentence acquisition module 401 for obtaining a sentence to be translated;
a target sentence extraction module 402 for inputting the sentence to be translated into a pre-trained translation model and extracting the target sentence;
wherein the translation model is trained with the following modules:
a training corpus extraction module for randomly extracting a preset number of training corpora from a preset parallel corpus;
a preprocessing module for preprocessing the training corpora to obtain preprocessed text;
a word segmentation module for performing word segmentation on the preprocessed text to obtain segmented text information;
a modeling module for encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
As the device embodiments are basically similar to the method embodiments, their description is relatively brief; for relevant points, refer to the corresponding parts of the description of the method embodiments.
Those skilled in the art will readily conceive of other embodiments of the invention after considering the specification and practicing the invention disclosed herein. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the invention indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit the invention; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall be included within the scope of protection of the present invention.
Claims (10)
1. A training method for a translation model, characterized by comprising:
randomly extracting a preset number of training corpora from a preset parallel corpus;
preprocessing the training corpora to obtain preprocessed text;
performing word segmentation on the preprocessed text to obtain segmented text information;
encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
2. The training method of claim 1, characterized in that the step of encoding the segmented text information in the forward and backward directions with the bidirectional RNN encoder and determining the hidden state of the bidirectional RNN encoder at each time step includes:
the forward RNN encodes the segmented text information in the forward direction to obtain a forward word-vector feature sequence X_F = (X_1, X_2, ..., X_T), and generates a forward hidden state Fh_i at each time step i, the forward hidden states over all time steps being (Fh_1, Fh_2, ..., Fh_T), i = 1, 2, ..., T; F denotes the forward hidden-state parameters of the translation model;
the backward RNN encodes the segmented text information in the backward direction to obtain a backward word-vector feature sequence X_B = (X_T, X_{T-1}, ..., X_2, X_1), and generates a backward hidden state Bh_i at each time step i, the backward hidden states over all time steps being (Bh_1, Bh_2, ..., Bh_T), i = 1, 2, ..., T; B denotes the backward hidden-state parameters of the translation model;
the hidden state h_i of the bidirectional RNN encoder at each time step is determined from the forward hidden state Fh_i and the backward hidden state Bh_i, where h_i = [Fh_i, Bh_i].
3. The training method of claim 2, characterized in that the step of decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with the unidirectional RNN decoder and building the translation model includes:
decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with the unidirectional RNN decoder to obtain the decoded-state function of the translation model.
4. The training method of claim 3, characterized in that the step of decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with the unidirectional RNN decoder to obtain the decoded-state function of the translation model includes:
obtaining the decoded state S_{i-1} of the unidirectional RNN decoder at time step i-1 and the corresponding label Y_{i-1};
obtaining the hidden state h_i and the semantic vector C_i of the bidirectional RNN encoder at the current time step i;
determining the decoded state S_i of the unidirectional RNN decoder at the current time step i from the decoded state S_{i-1}, the label Y_{i-1}, the hidden state h_i, and the semantic vector C_i;
where S_i = P(S_{i-1}, Y_{i-1}, h_i, C_i), and P(·) denotes the decoded-state function.
5. The training method of claim 4, characterized in that the semantic vector C_i is a weighted sum of the hidden states h = [h_1, h_2, ..., h_T] of the bidirectional RNN encoder.
6. The training method of claim 3, characterized in that the step of building the translation model further includes:
extracting, from the parallel corpus, the training target corpus aligned with the training corpus;
calculating, from the decoded-state function, the probability that each training corpus predicts the training target corpus;
calculating a loss rate from a preset loss function and the probability;
calculating a gradient from the loss rate;
judging whether the gradient satisfies a preset iteration condition;
if so, ending the translation model training;
if not, performing gradient descent on the model parameters of the translation model using the gradient and a preset learning rate, and returning to the step of extracting, from the parallel corpus, the training target corpus aligned with the training corpus.
7. A translation method, characterized by comprising:
obtaining a sentence to be translated;
inputting the sentence to be translated into a pre-trained translation model and extracting the target sentence;
wherein the translation model is trained in the following manner:
randomly extracting a preset number of training corpora from a preset parallel corpus;
preprocessing the training corpora to obtain preprocessed text;
performing word segmentation on the preprocessed text to obtain segmented text information;
encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
8. A training device for a translation model, characterized by comprising:
a training corpus extraction module for randomly extracting a preset number of training corpora from a preset parallel corpus;
a preprocessing module for preprocessing the training corpora to obtain preprocessed text;
a word segmentation module for performing word segmentation on the preprocessed text to obtain segmented text information;
a modeling module for encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
9. The training device of claim 8, characterized in that the modeling module includes:
a forward encoding submodule for the forward RNN to encode the segmented text information in the forward direction to obtain a forward word-vector feature sequence X_F = (X_1, X_2, ..., X_T) and to generate a forward hidden state Fh_i at each time step i, the forward hidden states over all time steps being (Fh_1, Fh_2, ..., Fh_T), i = 1, 2, ..., T; F denotes the forward hidden-state parameters of the translation model;
a backward encoding submodule for the backward RNN to encode the segmented text information in the backward direction to obtain a backward word-vector feature sequence X_B = (X_T, X_{T-1}, ..., X_2, X_1) and to generate a backward hidden state Bh_i at each time step i, the backward hidden states over all time steps being (Bh_1, Bh_2, ..., Bh_T), i = 1, 2, ..., T; B denotes the backward hidden-state parameters of the translation model;
a bidirectional-RNN-encoder hidden-state determination submodule for determining the hidden state h_i of the bidirectional RNN encoder at each time step from the forward hidden state Fh_i and the backward hidden state Bh_i, where h_i = [Fh_i, Bh_i].
10. A translation device, characterized by comprising:
a to-be-translated sentence acquisition module for obtaining a sentence to be translated;
a target sentence extraction module for inputting the sentence to be translated into a pre-trained translation model and extracting the target sentence;
wherein the translation model is trained with the following modules:
a training corpus extraction module for randomly extracting a preset number of training corpora from a preset parallel corpus;
a preprocessing module for preprocessing the training corpora to obtain preprocessed text;
a word segmentation module for performing word segmentation on the preprocessed text to obtain segmented text information;
a modeling module for encoding the segmented text information in the forward and backward directions with a bidirectional RNN encoder, determining the hidden state of the bidirectional RNN encoder at each time step, and decoding the hidden state and semantic vector of each time step of the bidirectional RNN encoder with a unidirectional RNN decoder to build the translation model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810896694.XA CN109271643A (en) | 2018-08-08 | 2018-08-08 | Training method for a translation model, translation method, and devices
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810896694.XA CN109271643A (en) | 2018-08-08 | 2018-08-08 | Training method for a translation model, translation method, and devices
Publications (1)
Publication Number | Publication Date |
---|---|
CN109271643A true CN109271643A (en) | 2019-01-25 |
Family
ID=65153188
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810896694.XA Pending CN109271643A (en) | 2018-08-08 | 2018-08-08 | Training method for a translation model, translation method, and devices
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109271643A (en) |
Application Events
- 2018-08-08: CN application CN201810896694.XA filed; published as CN109271643A (en); legal status: Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106126507A (en) * | 2016-06-22 | 2016-11-16 | 哈尔滨工业大学深圳研究生院 | A character-encoding-based deep neural translation method and system |
CN107464559A (en) * | 2017-07-11 | 2017-12-12 | 中国科学院自动化研究所 | Joint prediction model construction method and system based on Chinese prosodic structure and stress |
CN107729329A (en) * | 2017-11-08 | 2018-02-23 | 苏州大学 | A neural machine translation method and device based on a word-vector interconnection technique |
Cited By (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109858044A (en) * | 2019-02-01 | 2019-06-07 | 成都金山互动娱乐科技有限公司 | Language processing method and device, the training method of language processing system and device |
CN109858044B (en) * | 2019-02-01 | 2023-04-18 | 成都金山互动娱乐科技有限公司 | Language processing method and device, and training method and device of language processing system |
CN109933662A (en) * | 2019-02-15 | 2019-06-25 | 北京奇艺世纪科技有限公司 | Model training method, information generating method, device, electronic equipment and computer-readable medium |
CN109902313A (en) * | 2019-03-01 | 2019-06-18 | 北京金山数字娱乐科技有限公司 | A kind of interpretation method and device, the training method of translation model and device |
CN109902313B (en) * | 2019-03-01 | 2023-04-07 | 北京金山数字娱乐科技有限公司 | Translation method and device, and translation model training method and device |
CN110263349B (en) * | 2019-03-08 | 2024-09-13 | 腾讯科技(深圳)有限公司 | Corpus evaluation model training method and device, storage medium and computer equipment |
CN110263350A (en) * | 2019-03-08 | 2019-09-20 | 腾讯科技(深圳)有限公司 | Model training method, device, computer readable storage medium and computer equipment |
CN110263349A (en) * | 2019-03-08 | 2019-09-20 | 腾讯科技(深圳)有限公司 | Corpus assessment models training method, device, storage medium and computer equipment |
CN110263350B (en) * | 2019-03-08 | 2024-05-31 | 腾讯科技(深圳)有限公司 | Model training method, device, computer readable storage medium and computer equipment |
CN109931506A (en) * | 2019-03-14 | 2019-06-25 | 三川智慧科技股份有限公司 | Pipeline leakage detection method and device |
CN109871946A (en) * | 2019-03-15 | 2019-06-11 | 北京金山数字娱乐科技有限公司 | A kind of application method and device, training method and device of neural network model |
CN110188353A (en) * | 2019-05-28 | 2019-08-30 | 百度在线网络技术(北京)有限公司 | Text error correction method and device |
CN110188353B (en) * | 2019-05-28 | 2021-02-05 | 百度在线网络技术(北京)有限公司 | Text error correction method and device |
CN110210026B (en) * | 2019-05-29 | 2023-05-26 | 北京百度网讯科技有限公司 | Speech translation method, device, computer equipment and storage medium |
CN110210026A (en) * | 2019-05-29 | 2019-09-06 | 北京百度网讯科技有限公司 | Voice translation method, device, computer equipment and storage medium |
CN110619357A (en) * | 2019-08-29 | 2019-12-27 | 北京搜狗科技发展有限公司 | Picture processing method and device and electronic equipment |
CN110619357B (en) * | 2019-08-29 | 2022-03-04 | 北京搜狗科技发展有限公司 | Picture processing method and device and electronic equipment |
CN110795947A (en) * | 2019-08-30 | 2020-02-14 | 腾讯科技(深圳)有限公司 | Sentence translation method and device, storage medium and electronic device |
CN110879940A (en) * | 2019-11-21 | 2020-03-13 | 哈尔滨理工大学 | Machine translation method and system based on deep neural network |
CN110879940B (en) * | 2019-11-21 | 2022-07-12 | 哈尔滨理工大学 | Machine translation method and system based on deep neural network |
CN112926342A (en) * | 2019-12-06 | 2021-06-08 | 中兴通讯股份有限公司 | Method for constructing machine translation model, translation device and computer readable storage medium |
CN111027681A (en) * | 2019-12-09 | 2020-04-17 | 腾讯科技(深圳)有限公司 | Time sequence data processing model training method, data processing device and storage medium |
CN111027681B (en) * | 2019-12-09 | 2023-06-27 | 腾讯科技(深圳)有限公司 | Time sequence data processing model training method, data processing method, device and storage medium |
CN113468856A (en) * | 2020-03-31 | 2021-10-01 | 阿里巴巴集团控股有限公司 | Variant text generation method, variant text translation model training method, variant text classification device and variant text translation model training device |
CN111597829A (en) * | 2020-05-19 | 2020-08-28 | 腾讯科技(深圳)有限公司 | Translation method and device, storage medium and electronic equipment |
CN111597829B (en) * | 2020-05-19 | 2021-08-27 | 腾讯科技(深圳)有限公司 | Translation method and device, storage medium and electronic equipment |
CN111680528A (en) * | 2020-06-09 | 2020-09-18 | 合肥讯飞数码科技有限公司 | Translation model compression method, device, equipment and storage medium |
CN111680528B (en) * | 2020-06-09 | 2023-06-30 | 合肥讯飞数码科技有限公司 | Translation model compression method, device, equipment and storage medium |
CN114333830A (en) * | 2020-09-30 | 2022-04-12 | 中兴通讯股份有限公司 | Simultaneous interpretation model training method, simultaneous interpretation method, device and storage medium |
CN112287656A (en) * | 2020-10-12 | 2021-01-29 | 四川语言桥信息技术有限公司 | Text comparison method, device, equipment and storage medium |
CN112287656B (en) * | 2020-10-12 | 2024-05-28 | 四川语言桥信息技术有限公司 | Text comparison method, device, equipment and storage medium |
TWI765437B (en) * | 2020-11-30 | 2022-05-21 | 中華電信股份有限公司 | System, method and computer-readable medium for translating Chinese text into Taiwanese or Taiwanese pinyin |
CN112597778A (en) * | 2020-12-14 | 2021-04-02 | 华为技术有限公司 | Training method of translation model, translation method and translation equipment |
CN113836192A (en) * | 2021-08-13 | 2021-12-24 | 深译信息科技(横琴)有限公司 | Parallel corpus mining method and device, computer equipment and storage medium |
CN114997185A (en) * | 2021-10-27 | 2022-09-02 | 荣耀终端有限公司 | Translation method, medium, program product, and electronic device |
WO2023115770A1 (en) * | 2021-12-23 | 2023-06-29 | 科大讯飞股份有限公司 | Translation method and related device therefor |
CN114254657A (en) * | 2021-12-23 | 2022-03-29 | 科大讯飞股份有限公司 | Translation method and related equipment thereof |
CN116611458A (en) * | 2023-05-31 | 2023-08-18 | 本源量子计算科技(合肥)股份有限公司 | Text translation method and device, medium and electronic device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109271643A (en) | A kind of training method of translation model, interpretation method and device | |
CN110334361B (en) | Neural machine translation method for Chinese language | |
CN110929030B (en) | Joint training method for text summarization and sentiment classification | |
CN108829684A (en) | A Mongolian-Chinese neural machine translation method based on a transfer learning strategy | |
CN110309514B (en) | Semantic recognition method and device | |
CN111178094B (en) | Pre-training-based low-resource neural machine translation training method | |
CN109492227A (en) | A machine reading comprehension method based on a multi-head attention mechanism and dynamic iteration | |
WO2020107878A1 (en) | Method and apparatus for generating text summary, computer device and storage medium | |
CN111078866B (en) | Chinese text summary generation method based on a sequence-to-sequence model | |
CN109284506A (en) | A kind of user comment sentiment analysis system and method based on attention convolutional neural networks | |
CN110428820B (en) | Chinese and English mixed speech recognition method and device | |
CN110472252B (en) | Chinese-Vietnamese neural machine translation method based on transfer learning | |
CN109359297B (en) | Relationship extraction method and system | |
CN113283244B (en) | Pre-training model-based bidding data named entity identification method | |
CN112528637B (en) | Text processing model training method, device, computer equipment and storage medium | |
CN111143563A (en) | Text classification method based on integration of BERT, LSTM and CNN | |
CN108319666A (en) | An electric power service evaluation method based on multi-modal public opinion analysis | |
CN110069790A (en) | A machine translation system and method based on back-translation of the translated text | |
CN111858932A (en) | Multiple-feature Chinese and English emotion classification method and system based on Transformer | |
CN114757182A (en) | BERT short text sentiment analysis method for improving training mode | |
CN110688862A (en) | Mongolian-Chinese inter-translation method based on transfer learning | |
CN113657115B (en) | Multi-modal Mongolian sentiment analysis method based on irony recognition and fine-grained feature fusion | |
CN106683667A (en) | Automatic prosody extraction method and system and their application in natural language processing | |
CN110427616A (en) | A kind of text emotion analysis method based on deep learning | |
CN110162789A (en) | A word representation method and device based on Chinese pinyin | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190125 |