CN108829684A - A Mongolian-Chinese neural machine translation method based on a transfer learning strategy - Google Patents
A Mongolian-Chinese neural machine translation method based on a transfer learning strategy
- Publication number
- CN108829684A CN108829684A CN201810428618.6A CN201810428618A CN108829684A CN 108829684 A CN108829684 A CN 108829684A CN 201810428618 A CN201810428618 A CN 201810428618A CN 108829684 A CN108829684 A CN 108829684A
- Authority
- CN
- China
- Prior art keywords
- Chinese
- machine translation
- translation
- model
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
Abstract
The present invention addresses the currently low quality and poor performance of Mongolian-Chinese machine translation. Mongolian is a low-resource language, and collecting a large Mongolian-Chinese parallel bilingual corpus is extremely difficult; the transfer learning strategy of the present invention can effectively solve this problem. A transfer learning strategy solves a different but related problem with existing knowledge. First, a translation model is trained on a large-scale English-Chinese parallel corpus within a neural machine translation framework. Second, the parameter weights of the model trained on the large-scale English-Chinese parallel corpus are migrated into the Mongolian-Chinese neural machine translation framework, and the Mongolian-Chinese neural machine translation model is trained on the available Mongolian-Chinese parallel corpus. Finally, the neural machine translation output produced with the transfer learning strategy is compared with statistical machine translation output in terms of BLEU score and fluency. A controlled-variable comparison shows that the transfer learning strategy effectively improves Mongolian-Chinese machine translation performance.
Description
Technical field
The invention belongs to the field of neural machine translation technology, and in particular relates to a Mongolian-Chinese neural machine translation method based on a transfer learning strategy.
Background technique
Machine translation refers to the process of automatically converting one natural language into another natural language with the same meaning using a machine (computer). In recent years, with the growth of international exchange, machine translation has played an increasingly important role in people's production and daily life as an important means of breaking down language barriers. Neural machine translation, as one of the data-driven machine translation methods, depends heavily on the scale and quality of the parallel corpus data. Because neural networks have a huge number of parameters, neural machine translation significantly surpasses the translation quality of statistical machine translation only when the training corpus reaches a certain scale. However, the Mongolian-Chinese parallel corpus resources currently available for experiments are extremely limited, and building a large Mongolian-Chinese bilingual parallel corpus is extremely difficult and requires enormous human and material resources.
Research on Mongolian machine translation started late, and the inherent complexity of Mongolian grammar has made progress on Mongolian-Chinese machine translation relatively slow; the scarcity of Mongolian-Chinese parallel corpus data sets is a major obstacle to Mongolian-Chinese machine translation research that cannot be ignored. The core idea of transfer learning is to store the knowledge obtained by training on a source task and apply it to a new (different but related) task. A transfer learning strategy allows the knowledge of a network trained on abundant labeled data to be migrated into a model for which labeled data is scarce.
Some neural machine translation methods have already been proposed for low-resource languages suffering from parallel corpus scarcity. Because the Mongolian-Chinese parallel corpus is deficient and Mongolian grammar is inherently complex, translation quality remains unsatisfactory and the translation process still exhibits severe data sparsity. A transfer learning strategy applies learned knowledge to a related task, reducing the amount of training data the target task requires and offering a path toward general artificial intelligence. Compared with training a neural network from scratch, a transfer learning strategy can use the parameter weights of a trained network structure as pre-training, thereby accelerating translation model training and improving the final translation quality.
Summary of the invention
In order to overcome the above shortcomings of the prior art, the present invention sets out from the perspective of alleviating the data sparsity problem in Mongolian-Chinese machine translation and improving Mongolian-Chinese translation quality, and proposes a simple and effective transfer learning strategy for low-resource languages. At present, apart from Chinese and English, which possess abundant bilingual parallel corpus resources, nearly all other languages suffer from parallel corpus scarcity. The present invention obtains network parameter weights by training on a large English-Chinese parallel corpus resource, migrates them into a Mongolian-Chinese neural machine translation model, and then trains on the Mongolian-Chinese parallel corpus to obtain a Mongolian-Chinese neural translation model, thereby addressing the shortage of Mongolian-Chinese parallel corpora and achieving the goal of improving Mongolian-Chinese machine translation performance.
To achieve the above goals, the technical solution adopted by the present invention is as follows:
A Mongolian-Chinese neural machine translation method based on a transfer learning strategy: first, an English-Chinese neural machine translation model is trained using a large-scale English-Chinese parallel corpus; second, the learned network parameter weights are migrated into a Mongolian-Chinese neural machine translation model; then, the Mongolian-Chinese neural machine translation model is trained on the available Mongolian-Chinese parallel corpus, yielding a Mongolian-Chinese neural machine translation model based on the transfer learning strategy; finally, the trained Mongolian-Chinese neural machine translation model is used to perform Mongolian-Chinese neural machine translation.
The specific steps can be described as follows:
01: Split the data set and preprocess the Chinese and English corpora; splitting the data set means dividing it into a training set, a validation set, and a test set, and the preprocessing work includes Chinese word segmentation and English preprocessing;
02: Construct the RNN (recurrent neural network) machine translation model architecture, including the encoder and the decoder;
03: Train the English-Chinese neural machine translation model on the large-scale English-Chinese parallel corpus, adjusting and optimizing the network parameters during training with stochastic gradient descent (SGD);
04: Migrate the network parameter weights of the trained English-Chinese neural machine translation model into the Mongolian-Chinese neural machine translation model, using them to initialize the parameters of the Mongolian-Chinese network in place of random initialization;
06: Evaluate the translation output on the test set using the BLEU score.
Preferably, before model training, data preprocessing is performed on the English-Chinese parallel corpus and the Mongolian-Chinese parallel corpus resources. Data preprocessing is the preparatory work required before training a neural machine translation model on bilingual parallel corpora.
The data preprocessing uses the open-source software of the Stanford natural language processing group as its tools, including:
1) word segmentation of the Chinese corpus using the segmentation tool stanford-segmenter;
2) preprocessing of the English corpus using the English preprocessing tool stanford-ner.
The preprocessing is based on a conditional random field (CRF) model, i.e. a conditional probability model with the maximum entropy model as its main source: given input nodes, it is an undirected graphical model of the conditional probability of the output nodes. The CRF model is defined as an undirected graph G = (V, E), where V is the node set, the set of random variables Y, Y = {Y_i | 1 ≤ i ≤ m}, with m the number of units to be labeled for an input sentence, and E = {(Y_{i-1}, Y_i) | 2 ≤ i ≤ m} is the set of undirected edges, a linear chain of m-1 edges.
Given a sequence a to be labeled, the conditional probability of the corresponding label sequence b is:

p(b | a) = (1/Z(a)) · exp( Σ_i Σ_k λ_k f_k(b_{i-1}, b_i, a, i) + Σ_i Σ_k μ_k g_k(b_i, a, i) )

where i is the position index within the sequence, Z(a) is the normalization function, λ_k and μ_k are the parameters of the model, k indexes the features defined on each edge and on the corresponding node, and f_k and g_k are binary feature functions.
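As a concrete illustration of this conditional probability, the following sketch (not part of the patent; label count, sequence length, and feature weights are invented) enumerates a tiny linear-chain CRF by brute force, scoring edge and node features and normalizing by Z(a):

```python
import itertools
import numpy as np

# Toy linear-chain CRF: L labels, sequence length T (illustrative values).
# score(b | a) = sum_i mu[i, b_i] + sum_i lam[b_{i-1}, b_i]
# p(b | a)     = exp(score(b | a)) / Z(a)
L, T = 2, 3
rng = np.random.default_rng(0)
lam = rng.normal(size=(L, L))   # edge (transition) feature weights
mu = rng.normal(size=(T, L))    # node (emission) feature weights

def score(b):
    s = sum(mu[i, b[i]] for i in range(T))
    s += sum(lam[b[i - 1], b[i]] for i in range(1, T))
    return s

labelings = list(itertools.product(range(L), repeat=T))
Z = sum(np.exp(score(b)) for b in labelings)          # normalization Z(a)
p = {b: float(np.exp(score(b)) / Z) for b in labelings}
best = max(p, key=p.get)                              # most likely labeling
```

Real CRF taggers use dynamic programming (forward-backward, Viterbi) instead of enumeration; brute force is only tractable here because the label space is tiny.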
The neural machine translation model formula is:

P(y_n | y_{<n}, x) = exp( φ(V_{y_n}, C_s, C_t; θ) ) / Σ_{y ∈ D} exp( φ(V_y, C_s, C_t; θ) )

where θ is the parameter of the model, φ is a nonlinear function, y_n is the current target-language word, x is the source-language sentence, y_{<n} is the target-language sequence generated so far, V_y is the target-language word vector, D is the target-language vocabulary, C_s is the source-language context vector, and C_t is the target-language context vector.
The network type of the neural machine translation model is an RNN (recurrent neural network). In the RNN forward propagation algorithm, for any sequence index t, the hidden state h^(t) is obtained from the input x^(t) and the previous hidden state h^(t-1):

h^(t) = σ(U x^(t) + W h^(t-1) + b)

where σ is the activation function of the recurrent network, generally tanh, and b is the bias of the linear relation. The output of the model at sequence index t is expressed as o^(t) = V h^(t) + d, and finally the predicted output at index t is

ŷ^(t) = softmax(o^(t))

where d is the bias of the output node and U, V, W are the parameter matrices shared across the recurrent network.
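The forward recurrence above can be sketched in a few lines of numpy; the sizes and random weights below are illustrative only, not trained values:

```python
import numpy as np

# Minimal sketch of the RNN forward pass: h(t) = tanh(U x(t) + W h(t-1) + b),
# o(t) = V h(t) + d, prediction = softmax(o(t)). Dimensions are made up.
rng = np.random.default_rng(1)
n_in, n_hid, n_out = 4, 8, 5
U = rng.normal(scale=0.1, size=(n_hid, n_in))
W = rng.normal(scale=0.1, size=(n_hid, n_hid))
V = rng.normal(scale=0.1, size=(n_out, n_hid))
b = np.zeros(n_hid)
d = np.zeros(n_out)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_forward(xs):
    h = np.zeros(n_hid)                 # h(0)
    ys = []
    for x in xs:                        # for each sequence index t
        h = np.tanh(U @ x + W @ h + b)  # hidden state h(t)
        o = V @ h + d                   # output o(t)
        ys.append(softmax(o))           # predicted distribution
    return ys

ys = rnn_forward(rng.normal(size=(3, n_in)))
```

Note that U, V, W are reused at every time step, which is exactly the parameter sharing the text describes.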
During model training, the encoder and decoder are trained jointly, and the model formula is:

θ* = argmax_θ Σ_{n=1}^{N} log p(y_n | x_n; θ)

where θ is the parameter of the model, p is the conditional probability function, (x_n, y_n) denotes a bilingual training sentence pair, and N is the number of training samples; training uses the maximum likelihood estimation algorithm.
The network parameter weights learned by training the neural network on a bilingual parallel corpus are the parameter matrices connecting the nodes of the network. Using these learned weights to initialize the parameters of the Mongolian-Chinese network, in place of random initialization, realizes the migration of the trained network parameter weights into the Mongolian-Chinese neural machine translation model.
When the Mongolian-Chinese neural machine translation model is trained on the Mongolian-Chinese parallel corpus, the parameter settings of the English-Chinese and Mongolian-Chinese translation models, including vocabulary size, word-vector size, and hidden-layer size, must be consistent.
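The migration step can be sketched as a straight copy of parameter matrices between two models of identical shape. This is a hypothetical illustration (the dictionary of named weight matrices and their shapes are invented, not the patent's actual implementation):

```python
import numpy as np

# Hypothetical sketch of the migration step: the Mongolian-Chinese model is
# built with the same shapes as the English-Chinese model (same vocabulary
# size, word-vector size, hidden-layer size), and its random initialization
# is overwritten with the trained source weights. All names are illustrative.
def init_random(shapes, seed):
    rng = np.random.default_rng(seed)
    return {name: rng.normal(scale=0.1, size=shape) for name, shape in shapes.items()}

shapes = {"U": (8, 4), "W": (8, 8), "V": (5, 8)}   # must match between models
en_zh_model = init_random(shapes, seed=1)          # stands in for trained model
mn_zh_model = init_random(shapes, seed=2)          # target model, random init

for name in mn_zh_model:                           # transfer: replace random
    mn_zh_model[name] = en_zh_model[name].copy()   # init with source weights
```

The shape-matching requirement in the dictionary above is exactly why the text insists the two models' vocabulary, word-vector, and hidden-layer sizes be consistent.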
Further, the translation output of the Mongolian-Chinese neural machine translation prototype system is compared and evaluated against statistical machine translation output in terms of BLEU score, achieving the goal of finally improving Mongolian-Chinese machine translation performance.
The BLEU score is a tool for assessing machine translation quality; a higher score indicates better machine translation model performance. The BLEU formula is:

BLEU = BP · exp( Σ_{n=1}^{M} w_n log p_n )

where w_n = 1/M, M is the maximum n-gram order over the candidate and reference translations, with an upper limit of 4, and p_n is the n-gram precision; BP is the brevity penalty for translations that are too short:

BP = e^{min(1 − r/h, 0)}

where h is the number of words in the candidate translation and r is the length of the reference translation closest to h.
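A minimal single-reference version of the formulas above can be written directly (this is a simplified sketch for illustration, not the exact multi-reference scorer used in evaluation campaigns):

```python
import math
from collections import Counter

# Minimal BLEU sketch with M = 4 and w_n = 1/M, single reference.
def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, M=4):
    h, r = len(candidate), len(reference)
    bp = math.exp(min(1 - r / h, 0))               # brevity penalty BP
    log_p = 0.0
    for n in range(1, M + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        match = sum(min(c, ref[g]) for g, c in cand.items())  # clipped counts
        total = max(sum(cand.values()), 1)
        log_p += (1 / M) * math.log(max(match / total, 1e-9))
    return bp * math.exp(log_p)
```

For identical candidate and reference sentences the score is 1.0; any missing n-gram or a too-short candidate lowers it.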
The core idea of the transfer learning strategy is to store the knowledge obtained by training on a source task/domain and apply it to a new (different but related) task/domain. The present invention is a Mongolian-Chinese neural machine translation method based on a transfer learning strategy, and the research method belongs to model-based transfer learning. Model-based transfer learning assumes that the source domain and the target domain share model parameters; the specific migration method is to apply the model learned on the source domain to the target domain, and then learn a new model from the target domain data.
Compared with existing Mongolian-Chinese machine translation methods, the present invention first trains a translation model on a large-scale English-Chinese bilingual parallel corpus, while guaranteeing the high quality and wide coverage of that corpus; second, exploiting the correlation between machine translation across language pairs, the network parameters learned by the English-Chinese translation model are migrated into the Mongolian-Chinese machine translation model; finally, the available Mongolian-Chinese parallel corpus is used to train the Mongolian-Chinese neural machine translation model. The transfer learning strategy proposed by the present invention is simple and feasible to implement and effectively alleviates the data sparsity problem of machine translation for low-resource languages.
Description of the drawings
Fig. 1 is a comparison diagram of conventional machine learning and transfer learning.
Fig. 2 is a flow chart of the neural machine translation prototype system based on the transfer learning strategy.
Specific embodiment
The embodiments of the present invention are described in detail below with reference to the accompanying drawings.
The realization process of the Mongolian-Chinese neural machine translation method based on a transfer learning strategy of the present invention is as follows:
1. Data preprocessing of the corpora
Data preprocessing includes Chinese word segmentation and English data preprocessing. The segmentation tool stanford-segmenter from the Stanford natural language processing group's open-source software is used to segment the Chinese corpus, and the English preprocessing tool stanford-ner is used to preprocess the English corpus. Their basic working principle is the conditional random field (CRF), i.e. a conditional probability model with the maximum entropy model as its main source: given input nodes, it is an undirected graphical model of the conditional probability of the output nodes. The CRF model is defined as an undirected graph G = (V, E), where V is the node set, the set of random variables Y = {Y_i | 1 ≤ i ≤ m}, with m the number of units to be labeled for an input sentence, and E = {(Y_{i-1}, Y_i) | 2 ≤ i ≤ m} is the set of undirected edges, a linear chain of m-1 edges.
Given a sequence a to be labeled, the conditional probability of the corresponding label sequence b is:

p(b | a) = (1/Z(a)) · exp( Σ_i Σ_k λ_k f_k(b_{i-1}, b_i, a, i) + Σ_i Σ_k μ_k g_k(b_i, a, i) )

where i is the position index within the sequence, Z(a) is the normalization function, λ_k and μ_k are the parameters of the model, k indexes the features defined on each edge and on the corresponding node, and f_k and g_k are binary feature functions.
2. Statistical machine translation and neural machine translation modeling
A. Statistical machine translation model: the key problem of statistical machine translation is to learn a translation model automatically from a bilingual corpus using statistical methods and then, based on this translation model, to find the highest-scoring target sentence among the translation candidates for a source sentence as the best translation. In the noisy channel model, the target language T is treated as the input of the channel; after encoding by the noisy channel, a corresponding sequence is output, and this sequence is the source language S. The goal of statistical machine translation is then to recover the corresponding target language T from the source language S by decoding; this process is also known as decoding or translation. The statistical machine translation model formula is:

T* = argmax_T Pr(T | S) = argmax_T Pr(S | T) Pr(T)    (2)

where Pr(T) is the language model of the target language and Pr(S | T) is the bilingual translation model; this formula is known as the fundamental equation of statistical machine translation.
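The decision rule in equation (2) can be illustrated with a toy example; the candidate sentences and probability tables below are invented purely for illustration:

```python
# Toy sketch of the fundamental equation: among candidate target sentences T,
# choose argmax_T Pr(S|T) * Pr(T). All probabilities are made up.
def best_translation(source, candidates, channel, lm):
    # channel[(S, T)] plays the role of Pr(S|T); lm[T] plays Pr(T)
    return max(candidates, key=lambda T: channel[(source, T)] * lm[T])

channel = {("S", "T1"): 0.6, ("S", "T2"): 0.5}
lm = {"T1": 0.2, "T2": 0.4}
best = best_translation("S", ["T1", "T2"], channel, lm)
# T2 wins: 0.5 * 0.4 = 0.20 beats 0.6 * 0.2 = 0.12
```

The example shows why both factors matter: the candidate with the higher channel score loses because its language-model probability is too low.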
B. Neural machine translation model: neural machine translation is a machine translation method that uses a neural network to learn the mapping between natural languages directly. The nonlinear mapping of neural machine translation (NMT) differs from the linear statistical machine translation (SMT) model: neural machine translation describes the semantic equivalence of the two languages with the state vectors connecting the encoder and the decoder. Neural machine translation methods based on deep learning have surpassed traditional statistical machine translation methods to become the new mainstream technology. The key problem in realizing the mapping between natural languages (i.e. machine translation) with a neural network is conditional probability modeling; the neural machine translation model formula is:

P(y_n | y_{<n}, x) = exp( φ(V_{y_n}, C_s, C_t; θ) ) / Σ_{y ∈ D} exp( φ(V_y, C_s, C_t; θ) )

where θ is the parameter of the model, φ is a nonlinear function, y_n is the current target-language word, x is the source-language sentence, y_{<n} is the target-language sequence generated so far, V_y is the target-language word vector, D is the target-language vocabulary, C_s is the source-language context vector, and C_t is the target-language context vector.
C. The machine translation quality evaluation metric, the BLEU score, is a tool for assessing translation quality; a higher score indicates better machine translation model performance. The BLEU formula is:

BLEU = BP · exp( Σ_{n=1}^{M} w_n log p_n )

where w_n = 1/M, M is the maximum n-gram order over the candidate and reference translations, with an upper limit of 4, and p_n is the n-gram precision; BP is the brevity penalty for translations that are too short:

BP = e^{min(1 − r/h, 0)}    (7)

where h is the number of words in the candidate translation and r is the length of the reference translation closest to h.
3. The recurrent neural network (RNN) encoder-decoder architecture
Compared with traditional neural networks, recurrent neural networks are better at capturing relationships across context, so they are commonly used in natural language processing tasks. To predict the next word of a sentence, one normally needs the words that appear earlier in the sentence, because the words of a sentence are not independent of one another. In a recurrent neural network, the current output depends on the current input and the preceding outputs; an RNN is thus a neural network with a form of memory. The encoder-decoder model (Encoder-Decoder) is one kind of neural machine translation model: the encoder reads the source sentence, and its main task is to encode the source sentence into a real-valued vector of fixed dimension that represents the source-language semantic information; the decoder reads this real-valued vector and then generates the corresponding target-language word sequence one word at a time, until an end-of-sentence marker indicates the end of the translation process.
A. The encoder reads the network input x = (x_1, x_2, …, x_l) and encodes it into hidden states h = (h_1, h_2, …, h_l). The encoder is generally implemented with a recurrent neural network (RNN), with update formulas:

h_i = f(x_i, h_{i-1})    (5)
c = q({h_1, …, h_l})    (6)

where c is the source sentence representation and f and q are nonlinear functions.
B. Given the source sentence representation c and the previously generated sequence {y_1, y_2, …, y_{t-1}}, the decoder generates the corresponding target-language words y_t one by one; the model formula is:

p(y) = Π_{t=1}^{T} p(y_t | {y_1, …, y_{t-1}}, c)

where y = (y_1, y_2, …, y_T). The decoder likewise usually uses a recurrent neural network, of the form:

p(y_t | {y_1, …, y_{t-1}}, c) = g(y_{t-1}, s_t, c)    (8)

where g is a nonlinear function used to compute the probability of y_t, and s_t is the hidden state of the decoder.
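The data flow of equations (5)-(8) can be sketched end to end; the weights below are random and untrained, and the five-word "vocabulary" is invented, so this illustrates only the mechanics of encoding and greedy decoding, not a working translator:

```python
import numpy as np

# Schematic encoder-decoder: encode h_i = f(x_i, h_{i-1}), summarize with
# c = q({h_i}) (here simply the last state), then decode greedily with
# p(y_t | y_<t, c) = g(y_{t-1}, s_t, c). Sizes and weights are illustrative.
rng = np.random.default_rng(3)
d, V = 6, 5                                  # hidden size, vocabulary size
Emb = rng.normal(scale=0.1, size=(V, d))     # toy word embeddings
We = rng.normal(scale=0.1, size=(d, d))
Ue = rng.normal(scale=0.1, size=(d, d))
Wd = rng.normal(scale=0.1, size=(d, d))
Ud = rng.normal(scale=0.1, size=(d, d))
Cd = rng.normal(scale=0.1, size=(d, d))
Wo = rng.normal(scale=0.1, size=(V, d))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def encode(src_ids):                         # h_i = f(x_i, h_{i-1})
    h, hs = np.zeros(d), []
    for i in src_ids:
        h = np.tanh(We @ Emb[i] + Ue @ h)
        hs.append(h)
    return hs

def decode(hs, max_len=4):
    c = hs[-1]                               # c = q({h_1..h_l}): last state
    s, y, out = np.zeros(d), 0, []           # token 0 acts as start symbol
    for _ in range(max_len):
        s = np.tanh(Wd @ Emb[y] + Ud @ s + Cd @ c)
        p = softmax(Wo @ s)                  # p(y_t | y_<t, c) = g(...)
        y = int(np.argmax(p))                # greedy choice
        out.append(y)
    return out

out = decode(encode([1, 2, 3]))
```

A trained system would additionally stop on an end-of-sentence token rather than a fixed length, as the text describes.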
4. The neural network forward propagation algorithm and translation model training
A. In the forward propagation algorithm during recurrent neural network training, for any sequence index t, the hidden state h^(t) is obtained from the input x^(t) and the previous hidden state h^(t-1):

h^(t) = σ(U x^(t) + W h^(t-1) + b)

where σ is the activation function of the recurrent network, generally tanh, and b is the bias of the linear relation. The output of the model at sequence index t is expressed as o^(t) = V h^(t) + d, and finally the predicted output at index t is ŷ^(t) = softmax(o^(t)), where d is the bias of the output node and U, V, W are the parameter matrices shared across the recurrent network.
B. Given a parallel corpus, the most common training method for neural machine translation is maximum likelihood estimation. In the present invention, the encoder and decoder are trained jointly, and the model training formula is:

θ* = argmax_θ Σ_{n=1}^{N} log p(y_n | x_n; θ)

where θ is the parameter of the model, p is the conditional probability function, (x_n, y_n) denotes a bilingual training sentence pair, and N is the number of training samples; training uses the maximum likelihood estimation algorithm.
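Maximum-likelihood training by gradient descent (step 03 above uses SGD) can be sketched on a toy conditional model; the data are synthetic and the classifier is a plain softmax model, standing in for the much larger translation model only to show the objective decreasing:

```python
import numpy as np

# Toy sketch of maximum-likelihood training: fit p(y|x; theta) to synthetic
# data by gradient descent on the mean negative log-likelihood. Data, sizes,
# step count, and learning rate are invented for illustration.
rng = np.random.default_rng(4)
N, d, K = 200, 5, 3
X = rng.normal(size=(N, d))
true_W = rng.normal(size=(K, d))
y = (X @ true_W.T).argmax(axis=1)           # synthetic "reference" labels

def mean_nll(W):
    logits = X @ W.T
    logits -= logits.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-logp[np.arange(N), y].mean())

W = np.zeros((K, d))
before = mean_nll(W)                        # equals log K at the start
for _ in range(300):                        # plain full-batch gradient steps
    logits = X @ W.T
    logits -= logits.max(axis=1, keepdims=True)
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)
    P[np.arange(N), y] -= 1                 # d(mean NLL)/d(logits)
    W -= 0.5 * (P.T @ X) / N
after = mean_nll(W)
```

In the actual translation model, the gradient flows jointly through decoder and encoder, which is what "joint training" refers to.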
5. The attention mechanism
The translation quality of early neural machine translation was not ideal and did not surpass machine translation based on statistical methods. With the proposal of the end-to-end encoder-decoder framework for machine translation and the introduction of the attention mechanism into the neural machine translation framework, the performance of neural machine translation improved significantly, and attention gradually became a main component of the neural machine translation framework. A plain neural translation model represents the source sentence as a real-valued vector of fixed dimension; this approach has shortcomings, e.g. a fixed-size vector cannot fully express the semantic information of the source sentence. When the attention mechanism is added to the neural machine translation model, the decoder dynamically finds, through attention, the source-language words relevant to the target word being generated, which enhances the expressive power of the neural machine translation model and significantly improves translation quality in related experiments. With the attention mechanism, formula (8) is redefined as:
p(y_t | {y_1, …, y_{t-1}}, x) = g(y_{t-1}, s_t, c_t)    (9)

where s_t is the hidden state of the recurrent network at time t, obtained by:

s_t = f(s_{t-1}, y_{t-1}, c_t)    (10)

g and f are nonlinear functions. The context vector (Context Vector) c_t depends on the source encoding sequence (h_1, h_2, …, h_l), where h_i contains the contextual information of the i-th input word; c_t is computed as:

c_t = Σ_j a_tj h_j

a_tj is the weight of h_j, computed as:

a_tj = exp(e_tj) / Σ_k exp(e_tk)

where e_tj = a(s_{t-1}, h_j) is an alignment model that computes the degree of match between the word generated at time t and the j-th source-language word. Compared with plain neural machine translation, this method fuses more source-side information during decoding and can significantly improve translation quality.
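The attention computation above can be sketched directly: a softmax over alignment scores yields the weights a_tj, which mix the encoder states into the context vector c_t. The concatenation-based alignment function and all weights below are illustrative assumptions:

```python
import numpy as np

# Sketch of attention: e_tj = a(s_{t-1}, h_j) via a small feed-forward
# scorer, a_tj = softmax_j(e_tj), c_t = sum_j a_tj h_j. Sizes are made up.
rng = np.random.default_rng(5)
d, l = 6, 4
hs = rng.normal(size=(l, d))                   # encoder states h_1..h_l
s_prev = rng.normal(size=d)                    # decoder state s_{t-1}
Wa = rng.normal(scale=0.1, size=(d, 2 * d))    # alignment-model weights
v = rng.normal(size=d)

e = np.array([v @ np.tanh(Wa @ np.concatenate([s_prev, h])) for h in hs])
a = np.exp(e - e.max())
a /= a.sum()                                   # a_tj: softmax over positions
c = a @ hs                                     # context vector c_t
```

Because the weights a_tj sum to one, c_t is a convex combination of the encoder states, i.e. a soft selection of the most relevant source positions.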
6. The transfer learning strategy
A traditional machine learning (Machine Learning) model learns on the basis of given, sufficient training data, and the learned model is then used to classify and predict documents. The goal of transfer learning (Transfer Learning) is to use the knowledge acquired from one domain or task to help the learning of a related task. Fig. 1 compares the difference between traditional machine learning and transfer learning.

The idea of transfer learning is to store the knowledge obtained by training on a source task and apply it to a related task: take a model pre-trained on a large data set, use its structure and weights directly, and apply it to the target problem, i.e. "migrate" the pre-trained model into the target problem. How the pre-trained model is used is determined by the similarity and the size of the data sets of the source and target domains.
Table 1. How to use a pre-trained model in the four cases

Relationship between source and target domains | Model training method
Data set small, similarity high | Use the pre-trained model as a feature extractor
Data set small, similarity low | Freeze the weights of the first k layers of the pre-trained model and retrain the later layers
Data set large, similarity low | Initialize with the pre-trained model's weights and train on the new data set
Data set large, similarity high | Initialize with the pre-trained model's weights and train on the new data set
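The second row of Table 1 ("freeze the first k layers, retrain the rest") can be sketched as an update step that simply skips the frozen parameters; the layer names and values below are invented for illustration:

```python
import numpy as np

# Sketch of layer freezing: parameters in `frozen` keep their pre-trained
# values while the remaining layers are updated by gradient steps.
params = {"layer1": np.ones(3), "layer2": np.ones(3), "out": np.ones(3)}
frozen = {"layer1"}                         # first k layers stay pre-trained

def sgd_step(params, grads, lr=0.1):
    for name, g in grads.items():
        if name in frozen:
            continue                        # frozen layers are not updated
        params[name] -= lr * g

grads = {name: np.ones(3) for name in params}
sgd_step(params, grads)
```

Rows three and four of the table correspond to skipping the `frozen` set entirely and updating every layer from the pre-trained starting point.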
With reference to Fig. 2, the specific implementation steps of the neural machine translation prototype system based on the transfer learning strategy of the present invention are described as follows:
01: Split the data set and preprocess the Chinese and English corpora; splitting the data set means dividing it into a training set, a validation set, and a test set, and the preprocessing work includes Chinese word segmentation and English preprocessing;
02: Construct the RNN (recurrent neural network) machine translation model architecture, including the encoder and the decoder;
03: Train the English-Chinese neural machine translation model on the large-scale English-Chinese parallel corpus, adjusting and optimizing the network parameters during training with stochastic gradient descent (SGD);
04: Migrate the network parameter weights of the trained English-Chinese neural machine translation model into the Mongolian-Chinese neural machine translation model, using them to initialize the parameters of the Mongolian-Chinese network in place of random initialization;
06: Evaluate the translation output on the test set using the BLEU score.
To make the Mongolian-Chinese translation flow of the invention clearer, the process of translating one Mongolian sentence into Chinese is described in further detail below.
The translation process for the Mongolian example sentence is as follows:
01: The encoder compresses the Mongolian sentence into a real-valued vector of fixed dimension, which represents the semantic information of the source sentence;
02: The decoder inversely decodes the vector into the corresponding target-language sentence; as the decoder generates each target-language word, the attention mechanism dynamically finds the source-language context relevant to the current word, e.g. when generating the Chinese word for "work", the corresponding Mongolian word is the most relevant;
03: The translation output is evaluated with the BLEU score;
04: The complete Chinese translation is obtained: "Completing this work took us a long time."
Claims (10)
1. A Mongolian-Chinese neural machine translation method based on a transfer learning strategy, characterized in that: first, an English-Chinese neural machine translation model is trained using a large-scale English-Chinese parallel corpus; second, the learned network parameter weights are migrated into a Mongolian-Chinese neural machine translation model; then, the Mongolian-Chinese neural machine translation model is trained on the available Mongolian-Chinese parallel corpus, yielding a Mongolian-Chinese neural machine translation model based on the transfer learning strategy; finally, the trained Mongolian-Chinese neural machine translation model is used to perform Mongolian-Chinese neural machine translation.
2. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that, before model training, data preprocessing is performed on the English-Chinese parallel corpus and the Mongolian-Chinese parallel corpus resources.
3. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 2, characterized in that the data preprocessing uses the open-source software of the Stanford natural language processing group as its tools, including:
1) word segmentation of the Chinese corpus using the segmentation tool stanford-segmenter;
2) preprocessing of the English corpus using the English preprocessing tool stanford-ner;
the preprocessing is based on a conditional random field (CRF) model, defined as an undirected graph G = (V, E), where V is the node set, the set of random variables Y, Y = {Y_i | 1 ≤ i ≤ m}, with m the number of units to be labeled for an input sentence, and E = {(Y_{i-1}, Y_i) | 2 ≤ i ≤ m} is the set of undirected edges, a linear chain of m-1 edges;
given a sequence a to be labeled, the conditional probability of the corresponding label sequence b is:
p(b | a) = (1/Z(a)) · exp( Σ_i Σ_k λ_k f_k(b_{i-1}, b_i, a, i) + Σ_i Σ_k μ_k g_k(b_i, a, i) )
where i is the position index within the sequence, Z(a) is the normalization function, λ_k and μ_k are the parameters of the model, k indexes the features defined on each edge and on the corresponding node, and f_k and g_k are binary feature functions.
4. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that the neural machine translation model formula is:
P(y_n | y_{<n}, x) = exp( φ(V_{y_n}, C_s, C_t; θ) ) / Σ_{y ∈ D} exp( φ(V_y, C_s, C_t; θ) )
where θ is the parameter of the model, φ is a nonlinear function, y_n is the current target-language word, x is the source-language sentence, y_{<n} is the target-language sequence generated so far, V_y is the target-language word vector, D is the target-language vocabulary, C_s is the source-language context vector, and C_t is the target-language context vector.
5. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that the network type of the neural machine translation model is an RNN (recurrent neural network); in the RNN forward propagation algorithm, for any sequence index t, the hidden state h^(t) is obtained from the input x^(t) and the previous hidden state h^(t-1):
h^(t) = σ(U x^(t) + W h^(t-1) + b)
where σ is the activation function of the recurrent network, generally tanh, and b is the bias of the linear relation; the output of the model at sequence index t is expressed as o^(t) = V h^(t) + d, and finally the predicted output at index t is ŷ^(t) = softmax(o^(t)), where d is the bias of the output node and U, V, W are the parameter matrices shared across the recurrent network.
6. The Mongolian-Chinese neural machine translation method based on the transfer learning strategy according to claim 1, characterized in that during model training the encoder and the decoder are trained jointly, and the model formula is:
θ* = argmax_θ Σ_{n=1}^{N} log p(y_n | x_n; θ)
where θ is the set of model parameters, p is the conditional probability function, (x_n, y_n) denotes a bilingual training sentence pair, and N is the number of training samples; the training samples are fit using the maximum likelihood estimation algorithm.
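In practice the objective above is minimized as an average negative log-likelihood over the bilingual corpus. The sketch below uses made-up per-sentence probabilities p(y_n | x_n; θ) simply to show the quantity being optimized; it is not the patent's training code.

```python
import math

# Hypothetical model-assigned probabilities p(y_n | x_n; theta),
# one per bilingual sentence pair (x_n, y_n) in a toy corpus of N = 4
corpus_probs = [0.9, 0.7, 0.8, 0.6]

def neg_log_likelihood(probs):
    # Maximizing sum_n log p(y_n | x_n) == minimizing this average NLL
    return -sum(math.log(p) for p in probs) / len(probs)

nll = neg_log_likelihood(corpus_probs)
print(nll > 0)  # True — lower NLL means a better fit to the corpus
```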
7. The Mongolian-Chinese neural machine translation method based on the transfer learning strategy according to claim 1, characterized in that the network parameter weights learned by training the neural network on the bilingual parallel corpus are the parameter matrices connecting the nodes of the neural network; the learned network parameter weights are used to initialize the parameters of the Mongolian-Chinese neural network in place of random initialization, thereby transferring the trained network parameter weights to the Mongolian-Chinese neural machine translation model.
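The transfer step can be sketched as copying the parent (English-Chinese) model's parameter matrices over the child (Mongolian-Chinese) model's random initialization. The parameter names and shapes below are illustrative assumptions; note that this only works when the two models share the same hyperparameters, as required in claim 8.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical named parameter matrices shared by both models
shapes = {"encoder.U": (6, 4), "encoder.W": (6, 6), "decoder.V": (5, 6)}

# 1) Weights of the trained parent (en-zh) model (stand-ins here)
parent = {name: rng.normal(size=s) for name, s in shapes.items()}

# 2) Child (mn-zh) model built with the SAME dictionary size, word-vector
#    size, and hidden-layer size, initially random
child = {name: rng.normal(size=s) for name, s in shapes.items()}

# 3) Transfer: replace the child's random initialization with the
#    parent's trained weights
for name, w in parent.items():
    child[name] = w.copy()

print(all(np.array_equal(child[k], parent[k]) for k in shapes))  # True
```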
8. The Mongolian-Chinese neural machine translation method based on the transfer learning strategy according to claim 1, characterized in that when the Mongolian-Chinese parallel corpus is used to train the Mongolian-Chinese neural machine translation model, the parameter settings of the English-Chinese and Mongolian-Chinese translation models, including the dictionary size, the word-vector size, and the hidden-layer size, must be kept consistent.
9. The Mongolian-Chinese neural machine translation method based on the transfer learning strategy according to claim 1, characterized in that the translations produced by the Mongolian-Chinese neural machine translation prototype system are compared with statistical machine translation output and evaluated by BLEU score, ultimately achieving the goal of improving Mongolian-Chinese machine translation performance.
10. The Mongolian-Chinese neural machine translation method based on the transfer learning strategy according to claim 9, characterized in that the BLEU score is a tool for assessing the quality of machine translation output, a higher score indicating better machine translation model performance. The formula for the BLEU score is:
BLEU = BP · exp( Σ_{n=1}^{M} w_n · log p_n )
where w_n = 1/M, M is the n-gram order used to match the translation against the reference translation, with an upper limit of 4, p_n denotes the n-gram precision, and BP denotes the brevity penalty for short translations:
BP = e^{min(1-r/h, 0)}
where h is the number of words in the candidate translation and r is the length of the reference translation closest to h.
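The BLEU formula above can be sketched for a single sentence pair. This is a simplified implementation with one reference and a crude floor on zero precisions; the example sentences are made up, and production evaluation would use an established toolkit instead.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, M=4):
    h, r = len(candidate), len(reference)
    bp = math.exp(min(1 - r / h, 0))  # brevity penalty BP = e^{min(1-r/h, 0)}
    log_p = 0.0
    for n in range(1, M + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        clipped = sum(min(c, ref[g]) for g, c in cand.items())  # clipped matches
        total = max(sum(cand.values()), 1)
        p_n = clipped / total if clipped else 1e-9  # floor zero precisions
        log_p += math.log(p_n) / M                  # w_n = 1/M
    return bp * math.exp(log_p)

perfect = "the cat sat on the mat".split()
print(round(bleu(perfect, perfect), 4))  # 1.0 — identical sentences score 1
```

A truncated candidate is penalized both by lower n-gram precision and by BP, matching the intent of the formula.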
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810428618.6A CN108829684A (en) | 2018-05-07 | 2018-05-07 | A kind of illiteracy Chinese nerve machine translation method based on transfer learning strategy |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108829684A true CN108829684A (en) | 2018-11-16 |
Family
ID=64148400
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810428618.6A Pending CN108829684A (en) | 2018-05-07 | 2018-05-07 | A kind of illiteracy Chinese nerve machine translation method based on transfer learning strategy |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108829684A (en) |
Non-Patent Citations (4)
Title |
---|
李亚超 et al.: "Research on Tibetan-Chinese Neural Network Machine Translation", Journal of Chinese Information Processing (《中文信息学报》) * |
杜建: "Mongolian-Chinese Neural Network Machine Translation Incorporating Statistical Machine Translation Features", China Master's Theses Full-text Database, Information Science and Technology * |
苏依拉 et al.: "Statistical-Analysis-Based Machine Translation of Mongolian-Chinese Natural Language", Journal of Beijing University of Technology (《北京工业大学学报》) * |
赵伟: "Application of Conditional Random Fields to Mongolian Word Segmentation", China Master's Theses Full-text Database, Information Science and Technology * |
Cited By (51)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109408831B (en) * | 2018-10-11 | 2020-02-21 | 成都信息工程大学 | Remote supervision method for traditional Chinese medicine fine-grained syndrome name segmentation |
CN109408831A (en) * | 2018-10-11 | 2019-03-01 | 成都信息工程大学 | A kind of remote supervisory method of Chinese medicine fine granularity syndrome name segmentation |
CN109670180A (en) * | 2018-12-21 | 2019-04-23 | 语联网(武汉)信息技术有限公司 | The method and device of the translation personal characteristics of vectorization interpreter |
CN109670180B (en) * | 2018-12-21 | 2020-05-08 | 语联网(武汉)信息技术有限公司 | Method and device for translating individual characteristics of vectorized translator |
CN109740169A (en) * | 2019-01-09 | 2019-05-10 | 北京邮电大学 | A kind of Chinese medical book interpretation method based on dictionary and seq2seq pre-training mechanism |
CN109740169B (en) * | 2019-01-09 | 2020-10-13 | 北京邮电大学 | Traditional Chinese medicine ancient book translation method based on dictionary and seq2seq pre-training mechanism |
CN110083842A (en) * | 2019-03-27 | 2019-08-02 | 华为技术有限公司 | Translation quality detection method, device, machine translation system and storage medium |
CN110083842B (en) * | 2019-03-27 | 2023-10-03 | 华为技术有限公司 | Translation quality detection method, device, machine translation system and storage medium |
CN110059802A (en) * | 2019-03-29 | 2019-07-26 | 阿里巴巴集团控股有限公司 | For training the method, apparatus of learning model and calculating equipment |
US11514368B2 (en) | 2019-03-29 | 2022-11-29 | Advanced New Technologies Co., Ltd. | Methods, apparatuses, and computing devices for trainings of learning models |
CN110110061B (en) * | 2019-04-26 | 2023-04-18 | 同济大学 | Low-resource language entity extraction method based on bilingual word vectors |
CN110110061A (en) * | 2019-04-26 | 2019-08-09 | 同济大学 | Low-resource languages entity abstracting method based on bilingual term vector |
CN110210536A (en) * | 2019-05-22 | 2019-09-06 | 北京邮电大学 | A kind of the physical damnification diagnostic method and device of optical interconnection system |
CN110188331A (en) * | 2019-06-03 | 2019-08-30 | 腾讯科技(深圳)有限公司 | Model training method, conversational system evaluation method, device, equipment and storage medium |
CN110188331B (en) * | 2019-06-03 | 2023-05-26 | 腾讯科技(深圳)有限公司 | Model training method, dialogue system evaluation method, device, equipment and storage medium |
CN110245364A (en) * | 2019-06-24 | 2019-09-17 | 中国科学技术大学 | The multi-modal neural machine translation method of zero parallel corpora |
CN110245364B (en) * | 2019-06-24 | 2022-10-28 | 中国科学技术大学 | Zero-parallel corpus multi-modal neural machine translation method |
CN110334362B (en) * | 2019-07-12 | 2023-04-07 | 北京百奥知信息科技有限公司 | Method for solving and generating untranslated words based on medical neural machine translation |
CN110334362A (en) * | 2019-07-12 | 2019-10-15 | 北京百奥知信息科技有限公司 | A method of the solution based on medical nerve machine translation generates untranslated word |
CN110472252B (en) * | 2019-08-15 | 2022-12-13 | 昆明理工大学 | Method for translating Hanyue neural machine based on transfer learning |
CN110472252A (en) * | 2019-08-15 | 2019-11-19 | 昆明理工大学 | The method of the more neural machine translation of the Chinese based on transfer learning |
CN110472253A (en) * | 2019-08-15 | 2019-11-19 | 哈尔滨工业大学 | A kind of Sentence-level mechanical translation quality estimation model training method based on combination grain |
CN110472253B (en) * | 2019-08-15 | 2022-10-25 | 哈尔滨工业大学 | Sentence-level machine translation quality estimation model training method based on mixed granularity |
CN110688862A (en) * | 2019-08-29 | 2020-01-14 | 内蒙古工业大学 | Mongolian-Chinese inter-translation method based on transfer learning |
CN110674646A (en) * | 2019-09-06 | 2020-01-10 | 内蒙古工业大学 | Mongolian Chinese machine translation system based on byte pair encoding technology |
CN110674648A (en) * | 2019-09-29 | 2020-01-10 | 厦门大学 | Neural network machine translation model based on iterative bidirectional migration |
CN110674648B (en) * | 2019-09-29 | 2021-04-27 | 厦门大学 | Neural network machine translation model based on iterative bidirectional migration |
CN110457719B (en) * | 2019-10-08 | 2020-01-07 | 北京金山数字娱乐科技有限公司 | Translation model result reordering method and device |
CN110457719A (en) * | 2019-10-08 | 2019-11-15 | 北京金山数字娱乐科技有限公司 | A kind of method and device of translation model result reordering |
WO2021109679A1 (en) * | 2019-12-06 | 2021-06-10 | 中兴通讯股份有限公司 | Method for constructing machine translation model, translation apparatus and computer readable storage medium |
CN111046677B (en) * | 2019-12-09 | 2021-07-20 | 北京字节跳动网络技术有限公司 | Method, device, equipment and storage medium for obtaining translation model |
CN111046677A (en) * | 2019-12-09 | 2020-04-21 | 北京字节跳动网络技术有限公司 | Method, device, equipment and storage medium for obtaining translation model |
CN111008533A (en) * | 2019-12-09 | 2020-04-14 | 北京字节跳动网络技术有限公司 | Method, device, equipment and storage medium for obtaining translation model |
CN111144140B (en) * | 2019-12-23 | 2023-07-04 | 语联网(武汉)信息技术有限公司 | Zhongtai bilingual corpus generation method and device based on zero-order learning |
CN111144140A (en) * | 2019-12-23 | 2020-05-12 | 语联网(武汉)信息技术有限公司 | Zero-learning-based Chinese and Tai bilingual corpus generation method and device |
CN112101047A (en) * | 2020-08-07 | 2020-12-18 | 江苏金陵科技集团有限公司 | Machine translation method for matching language-oriented precise terms |
CN112016604A (en) * | 2020-08-19 | 2020-12-01 | 华东师范大学 | Zero-resource machine translation method applying visual information |
WO2022116819A1 (en) * | 2020-12-04 | 2022-06-09 | 北京有竹居网络技术有限公司 | Model training method and apparatus, machine translation method and apparatus, and device and storage medium |
CN112633018B (en) * | 2020-12-28 | 2022-04-15 | 内蒙古工业大学 | Mongolian Chinese neural machine translation method based on data enhancement |
CN112633018A (en) * | 2020-12-28 | 2021-04-09 | 内蒙古工业大学 | Mongolian Chinese neural machine translation method based on data enhancement |
CN112989848A (en) * | 2021-03-29 | 2021-06-18 | 华南理工大学 | Training method for neural machine translation model of field adaptive medical literature |
CN113033218B (en) * | 2021-04-16 | 2023-08-15 | 沈阳雅译网络技术有限公司 | Machine translation quality evaluation method based on neural network structure search |
CN113033218A (en) * | 2021-04-16 | 2021-06-25 | 沈阳雅译网络技术有限公司 | Machine translation quality evaluation method based on neural network structure search |
CN113239708A (en) * | 2021-04-28 | 2021-08-10 | 华为技术有限公司 | Model training method, translation method and translation device |
CN113642341A (en) * | 2021-06-30 | 2021-11-12 | 深译信息科技(横琴)有限公司 | Deep confrontation generation method for solving scarcity of medical text data |
CN113408302A (en) * | 2021-06-30 | 2021-09-17 | 澳门大学 | Method, device, equipment and storage medium for evaluating machine translation result |
CN113822078A (en) * | 2021-08-20 | 2021-12-21 | 北京中科凡语科技有限公司 | XLM-R model fused machine translation model training method |
CN113822078B (en) * | 2021-08-20 | 2023-09-08 | 北京中科凡语科技有限公司 | Training method of machine translation model fused with XLM-R model |
CN113657128B (en) * | 2021-08-25 | 2023-04-07 | 四川大学 | Learning translation system and storage medium based on importance measurement and low resource migration |
CN113657128A (en) * | 2021-08-25 | 2021-11-16 | 四川大学 | Learning translation system and storage medium based on importance measurement and low resource migration |
CN115270826A (en) * | 2022-09-30 | 2022-11-01 | 北京澜舟科技有限公司 | Multilingual translation model construction method, translation method and computer storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108829684A (en) | A kind of illiteracy Chinese nerve machine translation method based on transfer learning strategy | |
CN110334361B (en) | Neural machine translation method for Chinese language | |
CN106650813B (en) | A kind of image understanding method based on depth residual error network and LSTM | |
CN110348016B (en) | Text abstract generation method based on sentence correlation attention mechanism | |
CN106126507B (en) | A kind of depth nerve interpretation method and system based on character code | |
CN109376242B (en) | Text classification method based on cyclic neural network variant and convolutional neural network | |
CN109359294B (en) | Ancient Chinese translation method based on neural machine translation | |
CN109117483B (en) | Training method and device of neural network machine translation model | |
CN109635124A (en) | A kind of remote supervisory Relation extraction method of combination background knowledge | |
CN107967318A (en) | A kind of Chinese short text subjective item automatic scoring method and system using LSTM neutral nets | |
CN110688862A (en) | Mongolian-Chinese inter-translation method based on transfer learning | |
CN109359291A (en) | A kind of name entity recognition method | |
CN106569998A (en) | Text named entity recognition method based on Bi-LSTM, CNN and CRF | |
CN110134946B (en) | Machine reading understanding method for complex data | |
CN108897740A (en) | A kind of illiteracy Chinese machine translation method based on confrontation neural network | |
CN109977234A (en) | A kind of knowledge mapping complementing method based on subject key words filtering | |
CN110619127B (en) | Mongolian Chinese machine translation method based on neural network turing machine | |
CN113468895B (en) | Non-autoregressive neural machine translation method based on decoder input enhancement | |
CN108932232A (en) | A kind of illiteracy Chinese inter-translation method based on LSTM neural network | |
CN110909736A (en) | Image description method based on long-short term memory model and target detection algorithm | |
CN108765383A (en) | Video presentation method based on depth migration study | |
CN110162789A (en) | A kind of vocabulary sign method and device based on the Chinese phonetic alphabet | |
CN110837736B (en) | Named entity recognition method of Chinese medical record based on word structure | |
CN110442880B (en) | Translation method, device and storage medium for machine translation | |
CN111414770B (en) | Semi-supervised Mongolian neural machine translation method based on collaborative training |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20181116