CN108829684A - Mongolian-Chinese neural machine translation method based on a transfer learning strategy - Google Patents

Mongolian-Chinese neural machine translation method based on a transfer learning strategy Download PDF

Info

Publication number
CN108829684A
CN108829684A
Authority
CN
China
Prior art keywords
chinese
machine translation
translation
model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810428618.6A
Other languages
Chinese (zh)
Inventor
苏依拉
赵亚平
牛向华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University of Technology
Original Assignee
Inner Mongolia University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia University of Technology
Priority to CN201810428618.6A
Publication of CN108829684A
Current legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Abstract

The present invention addresses the currently low translation quality and poor translation performance of Mongolian-Chinese machine translation. Mongolian is a low-resource language, and collecting a large Mongolian-Chinese bilingual parallel corpus is extremely difficult; the transfer learning strategy of the present invention can effectively solve this problem. A transfer learning strategy solves problems in a different but related domain using existing knowledge. First, a neural machine translation model is trained on a large-scale English-Chinese parallel corpus. Second, the parameter weights of this trained translation model are migrated into a Mongolian-Chinese neural machine translation framework, and the Mongolian-Chinese neural machine translation model is trained on the available Mongolian-Chinese parallel corpus. Finally, the translations produced by the transfer-learning-based neural machine translation system and by statistical machine translation are compared and evaluated on BLEU score and translation fluency. Using a controlled-variable methodology, the results show that the transfer learning strategy effectively improves Mongolian-Chinese machine translation performance.

Description

Mongolian-Chinese neural machine translation method based on a transfer learning strategy
Technical field
The invention belongs to the field of neural machine translation technology and in particular relates to a Mongolian-Chinese neural machine translation method based on a transfer learning strategy.
Background art
Machine translation refers to the process of automatically converting one natural language into another natural language with the same meaning using a machine (computer). In recent years, with the growth of international exchange, machine translation, as an important means of breaking down language barriers, has played an increasingly large role in people's production and daily life. Neural machine translation, as a data-driven machine translation method, depends heavily on the scale and quality of the parallel corpus. Because the number of neural network parameters is huge, only when the training corpus reaches a certain scale can neural machine translation significantly surpass statistical machine translation in translation quality. However, the Mongolian-Chinese parallel corpus resources currently available for experiments are extremely limited, and building a large Mongolian-Chinese bilingual parallel corpus requires a great deal of manpower and material resources.
Research on Mongolian machine translation started late, and the grammatical complexity of Mongolian itself has made progress on Mongolian-Chinese machine translation relatively slow; the scarcity of Mongolian-Chinese parallel corpus data is a major obstacle to Mongolian-Chinese machine translation research that cannot be ignored. The core idea of transfer learning is to store the knowledge obtained by training on a source task and apply it to a new (different but related) task. A transfer learning strategy allows a network trained on abundant labeled data to migrate its knowledge into a model for which labeled data is scarce.
At present, some neural machine translation methods have been proposed for low-resource languages suffering from parallel corpus scarcity. Because the Mongolian-Chinese parallel corpus is deficient and Mongolian grammar is complex, translation quality remains unsatisfactory and the translation process still suffers from severe data sparsity. A transfer learning strategy applies knowledge already learned to a related task, reducing the amount of training data required by the target task and offering a step toward general artificial intelligence. Compared with training a neural network from scratch, transfer learning uses the parameter weights of an already trained network structure as pre-training, which accelerates translation model training and improves final translation quality.
Summary of the invention
In order to overcome the above shortcomings of the prior art, the present invention starts from the goals of alleviating the data sparsity of Mongolian-Chinese machine translation and improving Mongolian-Chinese translation quality, and proposes a simple and effective transfer learning strategy for low-resource languages. At present, apart from Chinese and English, which possess large bilingual parallel corpus resources, most other languages face the general problem of parallel corpus scarcity. The present invention trains network parameter weights on a large English-Chinese parallel corpus, migrates them into a Mongolian-Chinese neural machine translation model, and then trains on the Mongolian-Chinese parallel corpus to obtain a Mongolian-Chinese neural translation model, thereby addressing the shortage of Mongolian-Chinese parallel corpora and reaching the goal of improving Mongolian-Chinese machine translation performance.
To achieve the goals above, the technical solution adopted by the present invention is that:
A Mongolian-Chinese neural machine translation method based on a transfer learning strategy: first, an English-Chinese neural machine translation model is trained on a large-scale English-Chinese parallel corpus; second, the learned network parameter weights are migrated into a Mongolian-Chinese neural machine translation model; then, the Mongolian-Chinese neural machine translation model is trained on the available Mongolian-Chinese parallel corpus, yielding a Mongolian-Chinese neural machine translation model based on the transfer learning strategy; finally, the trained model is used to perform Mongolian-Chinese neural machine translation.
The specific steps can be described as follows:
01: Split the Chinese and English corpora into datasets and perform data preprocessing; dataset splitting means dividing the data into a training set, a validation set and a test set, and data preprocessing includes Chinese word segmentation and English preprocessing;
02: Build the RNN (recurrent neural network) neural machine translation model architecture, comprising an encoder and a decoder;
03: Train the English-Chinese neural machine translation model on the large-scale English-Chinese parallel corpus, adjusting and optimizing the network parameters with stochastic gradient descent (SGD) during training;
04: Migrate the network parameter weights of the trained English-Chinese neural machine translation model into the Mongolian-Chinese neural machine translation model, initializing the parameters of the Mongolian-Chinese network in place of random initialization;
05: Train the Mongolian-Chinese neural machine translation model on the available Mongolian-Chinese parallel corpus;
06: Evaluate the translations of the test set using the BLEU score.
Before model training, data preprocessing is preferably performed on the English-Chinese parallel corpus and the Mongolian-Chinese parallel corpus resources. Data preprocessing is the preparatory work to be done before training a neural machine translation model on bilingual parallel corpora.
The data preprocessing uses the open-source software of the Stanford University natural language processing laboratory as tools, including:
1) word segmentation of the Chinese corpus with the segmentation tool stanford-segmenter;
2) preprocessing of the English corpus with the English preprocessing tool stanford-ner.
The preprocessing is based on the conditional random field (CRF) model, a conditional probability model whose main source is the maximum entropy model: given the input nodes, it is an undirected graphical model of the conditional probability of the output nodes. The CRF model is defined as an undirected graph G = (V, E), where V is the node set, the set of random variables Y = {Y_i | 1 ≤ i ≤ m}, one labeling unit for each of the m tokens of an input sentence, and E = {Y_{i-1}, Y_i | 1 ≤ i ≤ m} is the set of undirected edges, a linear chain made up of m-1 edges.
Given a sequence a to be labeled, the conditional probability of the corresponding label sequence b is:

$$P(b \mid a) = \frac{1}{Z(a)} \exp\Big( \sum_{i=1}^{m} \sum_{k} \lambda_k f_k(b_{i-1}, b_i, a, i) + \sum_{i=1}^{m} \sum_{k} \lambda'_k f'_k(b_i, a, i) \Big)$$

where i is the index into the sequence, Z(a) is the normalization function, λ_k and λ'_k are the parameters of the model, k ranges over the features defined on each edge and on the corresponding node, and f_k and f'_k are binary feature functions.
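To make the formula concrete, the following minimal Python sketch scores a label sequence under a toy linear-chain CRF and normalizes by brute force over all label sequences; the feature functions, weights and label set are illustrative assumptions, not the features used by the Stanford tools.

```python
import math
from itertools import product

# Toy linear-chain CRF: score(b|a) = weighted sum of edge and node features.
# Feature functions and weights are illustrative, not those of stanford-ner.
def edge_feat(prev_label, label, seq, i):          # f_k(b_{i-1}, b_i, a, i)
    return 1.0 if prev_label == label else 0.0

def node_feat(label, seq, i):                      # f'_k(b_i, a, i)
    return 1.0 if seq[i].istitle() == (label == "B") else 0.0

def score(labels, seq, lam_edge=0.5, lam_node=1.2):
    s = sum(lam_edge * edge_feat(labels[i - 1], labels[i], seq, i)
            for i in range(1, len(seq)))
    s += sum(lam_node * node_feat(labels[i], seq, i) for i in range(len(seq)))
    return s

def conditional_prob(labels, seq, label_set=("B", "I")):
    # Z(a): brute-force normalization over all label sequences (toy sizes only).
    z = sum(math.exp(score(list(b), seq))
            for b in product(label_set, repeat=len(seq)))
    return math.exp(score(labels, seq)) / z

seq = ["Hohhot", "is", "a", "city"]
print(conditional_prob(["B", "I", "I", "I"], seq))
```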
The neural machine translation model formula is:

$$P(y_n \mid y_{<n}, x; \theta) = \frac{\exp f(V_{y_n}, C_s, C_t; \theta)}{\sum_{y \in D} \exp f(V_y, C_s, C_t; \theta)}$$

where θ is the parameter set of the model, f is a nonlinear function, y_n is the current target-language word, x is the source-language sentence, y_{<n} is the target-language prefix generated so far, V_y is the target-language word embedding, D is the target-language vocabulary, C_s is the source-language context vector, and C_t is the target-language context vector.
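As a concrete reading of this formula, the NumPy sketch below scores each word of a toy target vocabulary against source and target context vectors and normalizes with a softmax over D; the scoring function f, its parameters and all dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["我们", "工作", "完成", "</s>"]        # toy target vocabulary D
V = rng.normal(size=(len(vocab), 8))            # target word embeddings V_y
W = rng.normal(size=(8, 16))                    # parameters theta of f

def f(V_y, C_s, C_t):
    # Illustrative nonlinear scoring function f(V_y, C_s, C_t; theta).
    return np.tanh(V_y @ W) @ np.concatenate([C_s, C_t])

C_s, C_t = rng.normal(size=8), rng.normal(size=8)
scores = np.array([f(V[i], C_s, C_t) for i in range(len(vocab))])
probs = np.exp(scores) / np.exp(scores).sum()   # softmax over the vocabulary D
print(dict(zip(vocab, probs.round(3))))
```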
The network type of the neural machine translation model is a recurrent neural network (RNN). In the RNN forward propagation algorithm, for any sequence index t, the hidden state h^{(t)} is obtained from the input x^{(t)} and the previous hidden state h^{(t-1)}:

h^{(t)} = σ(U x^{(t)} + W h^{(t-1)} + b)

where σ is the activation function of the recurrent network (generally tanh) and b is the bias of the linear relation. The output of the model at sequence index t is o^{(t)} = V h^{(t)} + d, and the final prediction at sequence index t is

$$\hat{y}^{(t)} = \mathrm{softmax}(o^{(t)})$$

where d is the bias of the output nodes and U, V and W are the parameter matrices shared across the recurrent network.
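The forward recurrence can be written directly in NumPy; the dimensions below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 4, 6, 5
U = rng.normal(size=(n_hid, n_in))    # input-to-hidden
W = rng.normal(size=(n_hid, n_hid))   # hidden-to-hidden (shared across t)
V = rng.normal(size=(n_out, n_hid))   # hidden-to-output
b, d = np.zeros(n_hid), np.zeros(n_out)

def rnn_step(x_t, h_prev):
    h_t = np.tanh(U @ x_t + W @ h_prev + b)     # h(t) = sigma(Ux(t) + Wh(t-1) + b)
    o_t = V @ h_t + d                           # o(t) = Vh(t) + d
    y_t = np.exp(o_t) / np.exp(o_t).sum()       # prediction: softmax(o(t))
    return h_t, y_t

h = np.zeros(n_hid)
for x in rng.normal(size=(3, n_in)):            # a length-3 input sequence
    h, y = rnn_step(x, h)
print(y)
```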
In the model training, the encoder and decoder are trained jointly, with the model formula:

$$\hat{\theta} = \arg\max_{\theta} \sum_{n=1}^{N} \log P(y_n \mid x_n; \theta)$$

where θ is the parameter set of the model, P is the conditional probability function, (x_n, y_n) denotes a bilingual training sentence pair, and N is the number of training samples; the samples are trained with the maximum likelihood estimation algorithm.
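A hedged PyTorch-style sketch of maximum-likelihood training with SGD as in step 03: the "model" here is a placeholder linear scorer over a toy vocabulary rather than the full encoder-decoder, and all sizes and data are synthetic.

```python
import torch
import torch.nn as nn

# Minimal sketch of maximum-likelihood training with stochastic gradient descent.
vocab_size, ctx_dim, N = 50, 16, 128
model = nn.Linear(ctx_dim, vocab_size)             # stands in for p(y|x; theta)
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # SGD, as in step 03
loss_fn = nn.CrossEntropyLoss()                    # = -log P(y_n | x_n; theta)

x = torch.randn(N, ctx_dim)                        # stand-in source contexts x_n
y = torch.randint(0, vocab_size, (N,))             # stand-in target words y_n

for epoch in range(5):
    opt.zero_grad()
    nll = loss_fn(model(x), y)                     # mean negative log-likelihood
    nll.backward()
    opt.step()                                     # theta <- theta - lr * gradient
    print(epoch, nll.item())
```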
The network parameter weights learned by training the neural network on a bilingual parallel corpus are the parameter matrices connecting the nodes of the network. Using the learned weights to initialize the parameters of the Mongolian-Chinese network, instead of random initialization, realizes the migration of the trained network parameter weights into the Mongolian-Chinese neural machine translation model.
When the Mongolian-Chinese neural machine translation model is trained on the Mongolian-Chinese parallel corpus, the parameter settings of the English-Chinese and Mongolian-Chinese translation models, including vocabulary size, word embedding size and hidden layer size, must be consistent.
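In a modern framework, this weight migration and the consistency requirement might look like the following PyTorch sketch; the stand-in model builder is a hypothetical placeholder for the RNN encoder-decoder of step 02, not an API defined by this document.

```python
import torch
import torch.nn as nn

# Sketch of the weight migration: both models share one architecture (same
# vocabulary, embedding and hidden sizes, as the text requires), so the
# Mongolian-Chinese model can be initialized from the English-Chinese weights.
def build_nmt_model(vocab=1000, emb=64, hid=128):
    # Hypothetical stand-in for the RNN encoder-decoder of step 02.
    return nn.Sequential(nn.Embedding(vocab, emb),
                         nn.GRU(emb, hid, batch_first=True))

en_zh_model = build_nmt_model()     # imagine this was trained on En-Zh data
mo_zh_model = build_nmt_model()     # identical hyperparameters: shapes match
mo_zh_model.load_state_dict(en_zh_model.state_dict())  # replaces random init
# ... then continue training mo_zh_model on the Mongolian-Chinese corpus (step 05).
```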
Further, the translations of the Mongolian-Chinese neural machine translation prototype system and the statistical machine translation translations are compared and evaluated on their BLEU scores, achieving the aim of finally improving Mongolian-Chinese machine translation performance.
The BLEU score is a tool for assessing machine translation quality; a higher score indicates better machine translation model performance. The BLEU formula is:

$$\mathrm{BLEU} = BP \cdot \exp\Big( \sum_{n=1}^{M} w_n \log p_n \Big)$$

where w_n = 1/M, M is the maximum n-gram order over the candidate and reference translations (with an upper limit of 4), and p_n is the n-gram precision; BP is the brevity penalty for candidates shorter than the reference:

$$BP = e^{\min(1 - r/h,\, 0)}$$

where h is the number of words in the candidate translation and r is the reference translation length closest to h.
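The complete computation fits in a few lines of Python; this is a single-reference toy version with the example tokens chosen for illustration (real evaluations use established BLEU tools and corpus-level statistics).

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, M=4):
    log_p = 0.0
    for n in range(1, M + 1):                 # clipped n-gram precisions p_n
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        p_n = max(overlap / total, 1e-9)      # avoid log(0) in this toy version
        log_p += (1.0 / M) * math.log(p_n)    # w_n = 1/M
    h, r = len(candidate), len(reference)     # candidate and reference lengths
    bp = math.exp(min(1 - r / h, 0))          # brevity penalty BP
    return bp * math.exp(log_p)

print(bleu("这 项 工作 的 完成 需要 很 长 时间".split(),
           "完成 这 项 工作 我们 需要 很 长 时间".split()))
```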
The core idea of the transfer learning strategy is to store the knowledge obtained by training on a source task/domain and apply it to a new (different but related) task/domain. The present invention is a Mongolian-Chinese neural machine translation method based on a transfer learning strategy, and the research method belongs to model-based transfer learning. Model-based transfer learning assumes that the source domain and the target domain share model parameters; the specific migration method is to apply a model learned on the source domain to the target domain and then learn a new model from the target domain data.
Compared with existing Mongolian-Chinese machine translation methods, the invention first trains a translation model on a large-scale English-Chinese bilingual parallel corpus, ensuring high quality and wide coverage of the English-Chinese data; second, exploiting the correlation of machine translation across language pairs, it migrates the network parameters learned by the English-Chinese translation model into the Mongolian-Chinese machine translation model; finally, it trains Mongolian-Chinese neural machine translation on the available Mongolian-Chinese parallel corpus. The proposed transfer learning strategy is simple and feasible to implement and effectively alleviates the data sparsity problem of machine translation for low-resource languages.
Description of the drawings
Fig. 1 is a comparison diagram of conventional machine learning and transfer learning.
Fig. 2 is a flow chart of the neural machine translation prototype system based on the transfer learning strategy.
Specific embodiment
Embodiments of the present invention are described in detail below with reference to the accompanying drawings and examples.
The Mongolian-Chinese neural machine translation method based on the transfer learning strategy of the present invention is realized as follows:
1. Data preprocessing of the corpora
Data preprocessing includes Chinese word segmentation and English data preprocessing. The open-source segmentation tool stanford-segmenter of the Stanford University natural language processing laboratory is used to segment the Chinese corpus, and the English preprocessing tool stanford-ner is used to preprocess the English corpus. The basic working principle of these tools is the conditional random field (CRF), a conditional probability model whose main source is the maximum entropy model: given the input nodes, it is an undirected graphical model of the conditional probability of the output nodes. The CRF model is defined as an undirected graph G = (V, E), where V is the node set, the set of random variables Y = {Y_i | 1 ≤ i ≤ m}, one labeling unit for each of the m tokens of an input sentence, and E = {Y_{i-1}, Y_i | 1 ≤ i ≤ m} is the edge set, a linear chain made up of m-1 edges.
Given a sequence a to be labeled, the conditional probability of the corresponding label sequence b is:

$$P(b \mid a) = \frac{1}{Z(a)} \exp\Big( \sum_{i=1}^{m} \sum_{k} \lambda_k f_k(b_{i-1}, b_i, a, i) + \sum_{i=1}^{m} \sum_{k} \lambda'_k f'_k(b_i, a, i) \Big)$$

where i is the index into the sequence, Z(a) is the normalization function, λ_k and λ'_k are the parameters of the model, k ranges over the features defined on each edge and on the corresponding node, and f_k and f'_k are binary feature functions.
2. Statistical machine translation and neural machine translation modeling
A. Statistical machine translation model: the key problem of statistical machine translation is to learn a translation model automatically from a bilingual corpus by statistical methods and then, based on this translation model, to select the highest-scoring target sentence from the translation candidate set of a source sentence as the best translation. In the noisy channel model, the target-language sentence T is treated as the input of a noisy channel; after the channel encodes it, the corresponding output sequence is the source-language sentence S. The goal of statistical machine translation is to recover the corresponding target language T from the source language S by decoding, a process also called decoding or translation. The statistical machine translation model formula is:

$$\hat{T} = \arg\max_{T} \Pr(T \mid S) = \arg\max_{T} \Pr(S \mid T)\,\Pr(T) \qquad (2)$$

where Pr(T) is the language model of the target language and Pr(S|T) is the bilingual translation model; this formula is known as the fundamental equation of statistical machine translation.
B. Neural machine translation model: neural machine translation is a machine translation method that uses a neural network to learn the mapping between natural languages directly. The nonlinear mapping of neural machine translation (NMT) differs from the linear statistical machine translation (SMT) model: neural machine translation describes the bilingual semantic equivalence through the state vectors that connect the encoder and the decoder. Deep-learning-based neural machine translation has now surpassed the traditional statistical machine translation method and become the new mainstream technology. The key problem in realizing the natural-language mapping (i.e., machine translation) with a neural network is conditional probability modeling; the neural machine translation model formula is:

$$P(y_n \mid y_{<n}, x; \theta) = \frac{\exp f(V_{y_n}, C_s, C_t; \theta)}{\sum_{y \in D} \exp f(V_y, C_s, C_t; \theta)}$$

where θ is the parameter set of the model, f is a nonlinear function, y_n is the current target-language word, x is the source-language sentence, y_{<n} is the target-language prefix generated so far, V_y is the target-language word embedding, D is the target-language vocabulary, C_s is the source-language context vector, and C_t is the target-language context vector.
C. The machine translation quality evaluation metric, the BLEU score, is a tool for assessing machine translation quality; a higher score indicates better machine translation model performance. The BLEU formula is:

$$\mathrm{BLEU} = BP \cdot \exp\Big( \sum_{n=1}^{M} w_n \log p_n \Big)$$

where w_n = 1/M, M is the maximum n-gram order over the candidate and reference translations (with an upper limit of 4), and p_n is the n-gram precision; BP is the brevity penalty for candidates shorter than the reference:

$$BP = e^{\min(1 - r/h,\, 0)} \qquad (7)$$

where h is the number of words in the candidate translation and r is the reference translation length closest to h.
3. The recurrent neural network (RNN) encoder-decoder architecture
Recurrent neural networks are better than traditional feed-forward networks at capturing relationships across context and are therefore commonly used in natural language processing tasks. To predict the next word of a sentence one generally needs the words that came before it, because the words of a sentence are not independent of one another. In a recurrent neural network the current output depends on the current input and on the preceding outputs, so an RNN is a neural network with a memory function. The encoder-decoder model (Encoder-Decoder) is one of the neural machine translation architectures: the encoder reads the source sentence, its main task being to encode the source sentence into a real-valued vector of fixed dimension that represents the source-language semantic information; the decoder reads this vector and then generates the corresponding target-language word sequence one word at a time, until an end-of-sentence mark indicates the end of the translation process.
A. The encoder reads the network input x = (x_1, x_2, ..., x_l) and encodes it into hidden states h = (h_1, h_2, ..., h_l); the encoder is generally realized with a recurrent neural network (RNN), with the update formulas:

h_i = f(x_i, h_{i-1})   (5)

c = q({h_1, ..., h_l})   (6)

where c is the source sentence representation and f and q are nonlinear functions.
B. Given the source sentence representation c and the previously generated output sequence {y_1, y_2, ..., y_{t-1}}, the decoder generates the corresponding target-language word y_t step by step; the model formula is:

$$p(y) = \prod_{t=1}^{T} p(y_t \mid \{y_1, \ldots, y_{t-1}\}, c)$$

where y = (y_1, y_2, ..., y_T). The decoder likewise usually uses a recurrent neural network, of the form:

p(y_t | {y_1, ..., y_{t-1}}, c) = g(y_{t-1}, s_t, c)   (8)

where g is a nonlinear function used to compute the probability of y_t, and s_t is the hidden state of the decoder.
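A minimal PyTorch sketch of formulas (5)-(8): a GRU encoder compresses the source into a final state c, and a GRU decoder conditioned on that state produces a distribution over the next target word. All sizes and token ids are illustrative.

```python
import torch
import torch.nn as nn

# Minimal RNN encoder-decoder; sizes are illustrative, not the patent's settings.
class Encoder(nn.Module):
    def __init__(self, vocab, emb=32, hid=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True)  # h_i = f(x_i, h_{i-1})

    def forward(self, src):
        _, h = self.rnn(self.emb(src))
        return h                     # c = q({h_1..h_l}): here, the final state

class Decoder(nn.Module):
    def __init__(self, vocab, emb=32, hid=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)               # g(y_{t-1}, s_t, c)

    def forward(self, prev_tokens, state):
        s, state = self.rnn(self.emb(prev_tokens), state)
        return self.out(s), state    # logits defining p(y_t | y_<t, c)

src = torch.randint(0, 1000, (2, 7))                   # batch of source sentences
logits, _ = Decoder(1000)(torch.randint(0, 1000, (2, 5)), Encoder(1000)(src))
print(logits.shape)                                    # (2, 5, 1000)
```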
4. Neural network forward propagation and translation model training
A. In the forward propagation algorithm during recurrent network training, for any sequence index t, the hidden state h^{(t)} is obtained from the input x^{(t)} and the previous hidden state h^{(t-1)}:

h^{(t)} = σ(U x^{(t)} + W h^{(t-1)} + b)

where σ is the activation function of the recurrent network (generally tanh) and b is the bias of the linear relation. The output of the model at sequence index t is o^{(t)} = V h^{(t)} + d, and the final prediction at sequence index t is

$$\hat{y}^{(t)} = \mathrm{softmax}(o^{(t)})$$

where d is the bias of the output nodes and U, V and W are the parameter matrices shared across the recurrent network.
B. Given a parallel corpus, the most common training method for neural machine translation is maximum likelihood estimation. In the present invention the network is trained jointly over encoder and decoder, with the training formula:

$$\hat{\theta} = \arg\max_{\theta} \sum_{n=1}^{N} \log P(y_n \mid x_n; \theta)$$

where θ is the parameter set of the model, P is the conditional probability function, (x_n, y_n) denotes a bilingual training sentence pair, and N is the number of training samples; the samples are trained with the maximum likelihood estimation algorithm.
5. Attention mechanism
The early translation quality of neural machine translation was not ideal and did not surpass statistics-based machine translation. With the proposal of the end-to-end encoder-decoder framework for machine translation and the introduction of the attention mechanism into the neural machine translation framework, the performance of neural machine translation improved markedly, and this combination gradually became the main architecture of neural machine translation. A plain neural translation model represents the source sentence as a single real-valued vector of fixed dimension, which has shortcomings: a fixed-size vector cannot fully express the semantic information of the source sentence. When the attention mechanism is added to the neural machine translation model, the generation of each target-language word dynamically attends to the source-language words relevant to that word, which enhances the expressive power of the neural machine translation model and significantly improves translation quality in the related experiments. With the attention mechanism, formula (8) is redefined as:

p(y_t | {y_1, ..., y_{t-1}}, x) = g(y_{t-1}, s_t, c_t)   (9)

where s_t is the hidden state of the recurrent network at time t, obtained by:

s_t = f(s_{t-1}, y_{t-1}, c_t)   (10)

g and f are nonlinear functions, and the context vector (Context Vector) c_t depends on the source encoding sequence (h_1, h_2, ..., h_l), where h_i contains the context information of the i-th input word. c_t is computed as:

$$c_t = \sum_{j=1}^{l} a_{tj} h_j$$

a_{tj} is the weight of h_j, computed as:

$$a_{tj} = \frac{\exp(e_{tj})}{\sum_{k=1}^{l} \exp(e_{tk})}$$

where e_{tj} = a(s_{t-1}, h_j) is the alignment model, which measures how well the output at time t matches the j-th source-language word. Compared with plain neural machine translation, this method fuses more source-side information during decoding and can significantly improve translation quality.
6. Transfer learning strategy
A traditional machine learning (Machine Learning) model learns on the basis of sufficiently large training data and then uses the learned model to classify and predict documents. The goal of transfer learning (Transfer Learning) is to use knowledge acquired in one domain or task to help learn related tasks. Fig. 1 compares the difference between traditional machine learning and transfer learning.
The idea of transfer learning is to store the knowledge obtained by training on a source task and apply it to a related task: take a model pre-trained on a large dataset, use its structure and weights directly, and apply it to the target problem, i.e., "migrate" the pre-trained model into the target problem. How the pre-trained model is used is determined by the similarity and the size of the data in the source and target domains.
Table 1. How to use the pre-trained model in four cases

Relationship between source and target domains | Model training method
Dataset small, similarity high | use the pre-trained model as a feature extractor
Dataset small, similarity low | freeze the weights of the first k layers of the pre-trained model, retrain the later layers
Dataset large, similarity low | initialize with the pre-trained weights, train on the new dataset
Dataset large, similarity high | initialize with the pre-trained weights, train (fine-tune) on the new dataset
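For the second row of Table 1, freezing the first k layers and retraining the rest might look like the following PyTorch sketch; the stand-in model and the use of children() order as "depth" are illustrative assumptions.

```python
import torch.nn as nn

# Sketch of "freeze the first k layers, retrain the rest" from Table 1.
# `model` is any pre-trained network; children() order stands in for depth.
def freeze_first_k(model: nn.Module, k: int):
    for i, layer in enumerate(model.children()):
        if i < k:
            for p in layer.parameters():
                p.requires_grad = False    # frozen: keeps pre-trained weights

model = nn.Sequential(nn.Linear(8, 8), nn.Tanh(), nn.Linear(8, 8), nn.Linear(8, 2))
freeze_first_k(model, k=2)
print([p.requires_grad for p in model.parameters()])
```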
With reference to Fig. 2, the specific implementation steps of the neural machine translation prototype system based on the transfer learning strategy of the present invention are described as follows (a pipeline sketch follows the list):
01: Split the Chinese and English corpora into datasets and perform data preprocessing; dataset splitting means dividing the data into a training set, a validation set and a test set, and data preprocessing includes Chinese word segmentation and English preprocessing;
02: Build the RNN (recurrent neural network) neural machine translation model architecture, comprising an encoder and a decoder;
03: Train the English-Chinese neural machine translation model on the large-scale English-Chinese parallel corpus, adjusting and optimizing the network parameters with stochastic gradient descent (SGD) during training;
04: Migrate the network parameter weights of the trained English-Chinese neural machine translation model into the Mongolian-Chinese neural machine translation model, initializing the parameters of the Mongolian-Chinese network in place of random initialization;
05: Train the Mongolian-Chinese neural machine translation model on the available Mongolian-Chinese parallel corpus;
06: Evaluate the translations of the test set using the BLEU score.
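Putting steps 01-06 together, the driver below sketches the whole pipeline; every helper name here (load_corpus, split_dataset, build_rnn_encoder_decoder, train_sgd, corpus_bleu) is hypothetical pseudocode naming, not a real library API.

```python
# End-to-end sketch of steps 01-06. All helpers are hypothetical placeholders.
def run_transfer_pipeline():
    train_en, dev_en, test_en = split_dataset(load_corpus("en-zh"))  # step 01
    train_mo, dev_mo, test_mo = split_dataset(load_corpus("mn-zh"))
    en_zh = build_rnn_encoder_decoder()                              # step 02
    train_sgd(en_zh, train_en, dev_en)                               # step 03
    mo_zh = build_rnn_encoder_decoder()                              # same hyperparameters
    mo_zh.load_state_dict(en_zh.state_dict())                        # step 04: transfer
    train_sgd(mo_zh, train_mo, dev_mo)                               # step 05
    print("BLEU:", corpus_bleu(mo_zh, test_mo))                      # step 06
```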
To make the Mongolian-Chinese translation flow of the invention clearer, the translation of one Mongolian sentence into Chinese is described below in further detail.
For the given Mongolian sentence, the translation process is as follows:
01: The encoder compresses the Mongolian sentence into a real-valued vector of fixed dimension, the vector representing the semantic information of the source sentence;
02: The decoder inversely decodes this vector into the corresponding target-language sentence; as the decoder generates each target-language word, the attention mechanism dynamically finds the source-language context relevant to the current word, so that, for example, when the Chinese word for "work" is generated, the corresponding Mongolian word is the most relevant;
03: The translation is evaluated on its BLEU score;
04: The complete Chinese translation is obtained: "Completing this work took us a long time."
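The decoding loop in steps 01-02 above could be realized greedily as in this sketch, which reuses the Encoder/Decoder classes from the earlier encoder-decoder snippet; the bos/eos token ids are placeholders.

```python
import torch

# Greedy decoding sketch: encode the source once, then emit one target token
# at a time until the end-of-sentence id. Encoder/Decoder are the classes
# sketched earlier; all token ids are placeholders.
def greedy_decode(encoder, decoder, src, bos_id=1, eos_id=2, max_len=20):
    state = encoder(src)                     # step 01: compress the source sentence
    token = torch.tensor([[bos_id]])
    output = []
    for _ in range(max_len):                 # step 02: one target word per iteration
        logits, state = decoder(token, state)
        token = logits[:, -1].argmax(dim=-1, keepdim=True)  # most probable word
        if token.item() == eos_id:
            break                            # end-of-translation mark
        output.append(token.item())
    return output

print(greedy_decode(Encoder(1000), Decoder(1000), torch.randint(0, 1000, (1, 7))))
```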

Claims (10)

1. A Mongolian-Chinese neural machine translation method based on a transfer learning strategy, characterized in that: first, an English-Chinese neural machine translation model is trained on a large-scale English-Chinese parallel corpus; second, the network parameter weights learned in training are migrated into a Mongolian-Chinese neural machine translation model; then, the Mongolian-Chinese neural machine translation model is trained on the available Mongolian-Chinese parallel corpus, obtaining a Mongolian-Chinese neural machine translation model based on the transfer learning strategy; finally, the trained Mongolian-Chinese neural machine translation model is used to perform Mongolian-Chinese neural machine translation.
2. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that before model training, data preprocessing is performed on the English-Chinese parallel corpus and the Mongolian-Chinese parallel corpus resources.
3. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 2, characterized in that the data preprocessing uses the open-source software of the Stanford University natural language processing laboratory as tools, including:
1) word segmentation of the Chinese corpus with the segmentation tool stanford-segmenter;
2) preprocessing of the English corpus with the English preprocessing tool stanford-ner;
the preprocessing is based on the conditional random field (CRF) model; the CRF model is defined as an undirected graph G = (V, E), where V is the node set, the set of random variables Y = {Y_i | 1 ≤ i ≤ m}, one labeling unit for each of the m tokens of an input sentence, and E = {Y_{i-1}, Y_i | 1 ≤ i ≤ m} is the set of undirected edges, a linear chain made up of m-1 edges;
given a sequence a to be labeled, the conditional probability of the corresponding label sequence b is:

$$P(b \mid a) = \frac{1}{Z(a)} \exp\Big( \sum_{i=1}^{m} \sum_{k} \lambda_k f_k(b_{i-1}, b_i, a, i) + \sum_{i=1}^{m} \sum_{k} \lambda'_k f'_k(b_i, a, i) \Big)$$

where i is the index into the sequence, Z(a) is the normalization function, λ_k and λ'_k are the parameters of the model, k ranges over the features defined on each edge and on the corresponding node, and f_k and f'_k are binary feature functions.
4. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that the neural machine translation model formula is:

$$P(y_n \mid y_{<n}, x; \theta) = \frac{\exp f(V_{y_n}, C_s, C_t; \theta)}{\sum_{y \in D} \exp f(V_y, C_s, C_t; \theta)}$$

where θ is the parameter set of the model, f is a nonlinear function, y_n is the current target-language word, x is the source-language sentence, y_{<n} is the target-language prefix generated so far, V_y is the target-language word embedding, D is the target-language vocabulary, C_s is the source-language context vector, and C_t is the target-language context vector.
5. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that the network type of the neural machine translation model is a recurrent neural network (RNN); in the RNN forward propagation algorithm, for any sequence index t, the hidden state h^{(t)} is obtained from the input x^{(t)} and the previous hidden state h^{(t-1)}:

h^{(t)} = σ(U x^{(t)} + W h^{(t-1)} + b)

where σ is the activation function of the recurrent network (generally tanh) and b is the bias of the linear relation; the output of the model at sequence index t is o^{(t)} = V h^{(t)} + d, and the final prediction at sequence index t is

$$\hat{y}^{(t)} = \mathrm{softmax}(o^{(t)})$$

where d is the bias of the output nodes and U, V and W are the parameter matrices shared across the recurrent network.
6. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that in the model training, the encoder and decoder are trained jointly, with the model formula:

$$\hat{\theta} = \arg\max_{\theta} \sum_{n=1}^{N} \log P(y_n \mid x_n; \theta)$$

where θ is the parameter set of the model, P is the conditional probability function, (x_n, y_n) denotes a bilingual training sentence pair, and N is the number of training samples; the samples are trained with the maximum likelihood estimation algorithm.
7. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that the network parameter weights learned by training the neural network on a bilingual parallel corpus are the parameter matrices connecting the nodes of the neural network; the learned network parameter weights are used to initialize the parameters of the Mongolian-Chinese neural network instead of random initialization, realizing the migration of the trained network parameter weights into the Mongolian-Chinese neural machine translation model.
8. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that when the Mongolian-Chinese neural machine translation model is trained on the Mongolian-Chinese parallel corpus, the parameter settings of the English-Chinese and Mongolian-Chinese translation models, including vocabulary size, word embedding size and hidden layer size, must be consistent.
9. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 1, characterized in that the translations of the Mongolian-Chinese neural machine translation prototype system are compared and evaluated against statistical machine translation translations on their BLEU scores, achieving the aim of finally improving Mongolian-Chinese machine translation performance.
10. The Mongolian-Chinese neural machine translation method based on a transfer learning strategy according to claim 9, characterized in that the BLEU score is a tool for assessing machine translation quality, a higher score indicating better machine translation model performance; the BLEU formula is:

$$\mathrm{BLEU} = BP \cdot \exp\Big( \sum_{n=1}^{M} w_n \log p_n \Big)$$

where w_n = 1/M, M is the maximum n-gram order over the candidate and reference translations (with an upper limit of 4), and p_n is the n-gram precision; BP is the brevity penalty for candidates shorter than the reference:

$$BP = e^{\min(1 - r/h,\, 0)}$$

where h is the number of words in the candidate translation and r is the reference translation length closest to h.
CN201810428618.6A 2018-05-07 2018-05-07 Mongolian-Chinese neural machine translation method based on a transfer learning strategy Pending CN108829684A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810428618.6A CN108829684A (en) 2018-05-07 2018-05-07 Mongolian-Chinese neural machine translation method based on a transfer learning strategy

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810428618.6A CN108829684A (en) 2018-05-07 2018-05-07 Mongolian-Chinese neural machine translation method based on a transfer learning strategy

Publications (1)

Publication Number Publication Date
CN108829684A true CN108829684A (en) 2018-11-16

Family

ID=64148400

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810428618.6A Pending CN108829684A (en) 2018-05-07 Mongolian-Chinese neural machine translation method based on a transfer learning strategy

Country Status (1)

Country Link
CN (1) CN108829684A (en)

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
李亚超 et al., "Research on Tibetan-Chinese Neural Network Machine Translation", Journal of Chinese Information Processing *
杜建, "Mongolian-Chinese Neural Network Machine Translation Incorporating Statistical Machine Translation Features", China Master's Theses Full-text Database, Information Science and Technology *
苏依拉 et al., "Machine Translation of Mongolian-Chinese Natural Language Based on Statistical Analysis", Journal of Beijing University of Technology *
赵伟, "Application of Conditional Random Fields to Mongolian Word Segmentation", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408831B (en) * 2018-10-11 2020-02-21 成都信息工程大学 Remote supervision method for traditional Chinese medicine fine-grained syndrome name segmentation
CN109408831A (en) Remote supervision method for traditional Chinese medicine fine-grained syndrome name segmentation
CN109670180A (en) Method and device for vectorizing the individual translation characteristics of a translator
CN109670180B (en) * 2018-12-21 2020-05-08 语联网(武汉)信息技术有限公司 Method and device for translating individual characteristics of vectorized translator
CN109740169A (en) Traditional Chinese medicine ancient book translation method based on a dictionary and a seq2seq pre-training mechanism
CN109740169B (en) * 2019-01-09 2020-10-13 北京邮电大学 Traditional Chinese medicine ancient book translation method based on dictionary and seq2seq pre-training mechanism
CN110083842A (en) * 2019-03-27 2019-08-02 华为技术有限公司 Translation quality detection method, device, machine translation system and storage medium
CN110083842B (en) * 2019-03-27 2023-10-03 华为技术有限公司 Translation quality detection method, device, machine translation system and storage medium
CN110059802A (en) Method, apparatus and computing device for training learning models
US11514368B2 (en) 2019-03-29 2022-11-29 Advanced New Technologies Co., Ltd. Methods, apparatuses, and computing devices for trainings of learning models
CN110110061B (en) * 2019-04-26 2023-04-18 同济大学 Low-resource language entity extraction method based on bilingual word vectors
CN110110061A (en) * 2019-04-26 2019-08-09 同济大学 Low-resource languages entity abstracting method based on bilingual term vector
CN110210536A (en) Physical damage diagnosis method and device for optical interconnection systems
CN110188331A (en) * 2019-06-03 2019-08-30 腾讯科技(深圳)有限公司 Model training method, conversational system evaluation method, device, equipment and storage medium
CN110188331B (en) * 2019-06-03 2023-05-26 腾讯科技(深圳)有限公司 Model training method, dialogue system evaluation method, device, equipment and storage medium
CN110245364A (en) Zero-parallel-corpus multi-modal neural machine translation method
CN110245364B (en) * 2019-06-24 2022-10-28 中国科学技术大学 Zero-parallel corpus multi-modal neural machine translation method
CN110334362B (en) * 2019-07-12 2023-04-07 北京百奥知信息科技有限公司 Method for solving and generating untranslated words based on medical neural machine translation
CN110334362A (en) Method for resolving untranslated word generation in medical neural machine translation
CN110472252B (en) * 2019-08-15 2022-12-13 昆明理工大学 Method for translating Hanyue neural machine based on transfer learning
CN110472252A (en) Chinese-Vietnamese neural machine translation method based on transfer learning
CN110472253A (en) Sentence-level machine translation quality estimation model training method based on mixed granularity
CN110472253B (en) * 2019-08-15 2022-10-25 哈尔滨工业大学 Sentence-level machine translation quality estimation model training method based on mixed granularity
CN110688862A (en) * 2019-08-29 2020-01-14 内蒙古工业大学 Mongolian-Chinese inter-translation method based on transfer learning
CN110674646A (en) * 2019-09-06 2020-01-10 内蒙古工业大学 Mongolian Chinese machine translation system based on byte pair encoding technology
CN110674648A (en) * 2019-09-29 2020-01-10 厦门大学 Neural network machine translation model based on iterative bidirectional migration
CN110674648B (en) * 2019-09-29 2021-04-27 厦门大学 Neural network machine translation model based on iterative bidirectional migration
CN110457719B (en) * 2019-10-08 2020-01-07 北京金山数字娱乐科技有限公司 Translation model result reordering method and device
CN110457719A (en) Method and device for reordering translation model results
WO2021109679A1 (en) * 2019-12-06 2021-06-10 中兴通讯股份有限公司 Method for constructing machine translation model, translation apparatus and computer readable storage medium
CN111046677B (en) * 2019-12-09 2021-07-20 北京字节跳动网络技术有限公司 Method, device, equipment and storage medium for obtaining translation model
CN111046677A (en) * 2019-12-09 2020-04-21 北京字节跳动网络技术有限公司 Method, device, equipment and storage medium for obtaining translation model
CN111008533A (en) * 2019-12-09 2020-04-14 北京字节跳动网络技术有限公司 Method, device, equipment and storage medium for obtaining translation model
CN111144140B (en) * 2019-12-23 2023-07-04 语联网(武汉)信息技术有限公司 Zhongtai bilingual corpus generation method and device based on zero-order learning
CN111144140A (en) Chinese-Thai bilingual corpus generation method and device based on zero-shot learning
CN112101047A (en) * 2020-08-07 2020-12-18 江苏金陵科技集团有限公司 Machine translation method for matching language-oriented precise terms
CN112016604A (en) * 2020-08-19 2020-12-01 华东师范大学 Zero-resource machine translation method applying visual information
WO2022116819A1 (en) * 2020-12-04 2022-06-09 北京有竹居网络技术有限公司 Model training method and apparatus, machine translation method and apparatus, and device and storage medium
CN112633018B (en) * 2020-12-28 2022-04-15 内蒙古工业大学 Mongolian Chinese neural machine translation method based on data enhancement
CN112633018A (en) * 2020-12-28 2021-04-09 内蒙古工业大学 Mongolian Chinese neural machine translation method based on data enhancement
CN112989848A (en) * 2021-03-29 2021-06-18 华南理工大学 Training method for neural machine translation model of field adaptive medical literature
CN113033218B (en) * 2021-04-16 2023-08-15 沈阳雅译网络技术有限公司 Machine translation quality evaluation method based on neural network structure search
CN113033218A (en) * 2021-04-16 2021-06-25 沈阳雅译网络技术有限公司 Machine translation quality evaluation method based on neural network structure search
CN113239708A (en) * 2021-04-28 2021-08-10 华为技术有限公司 Model training method, translation method and translation device
CN113642341A (en) Deep adversarial generation method for addressing the scarcity of medical text data
CN113408302A (en) * 2021-06-30 2021-09-17 澳门大学 Method, device, equipment and storage medium for evaluating machine translation result
CN113822078A (en) * 2021-08-20 2021-12-21 北京中科凡语科技有限公司 XLM-R model fused machine translation model training method
CN113822078B (en) * 2021-08-20 2023-09-08 北京中科凡语科技有限公司 Training method of machine translation model fused with XLM-R model
CN113657128B (en) * 2021-08-25 2023-04-07 四川大学 Learning translation system and storage medium based on importance measurement and low resource migration
CN113657128A (en) * 2021-08-25 2021-11-16 四川大学 Learning translation system and storage medium based on importance measurement and low resource migration
CN115270826A (en) * 2022-09-30 2022-11-01 北京澜舟科技有限公司 Multilingual translation model construction method, translation method and computer storage medium

Similar Documents

Publication Publication Date Title
CN108829684A (en) Mongolian-Chinese neural machine translation method based on a transfer learning strategy
CN110334361B (en) Neural machine translation method for Chinese language
CN106650813B (en) Image understanding method based on a deep residual network and LSTM
CN110348016B (en) Text abstract generation method based on sentence correlation attention mechanism
CN106126507B (en) Deep neural translation method and system based on character encoding
CN109376242B (en) Text classification method based on recurrent neural network variants and convolutional neural networks
CN109359294B (en) Ancient Chinese translation method based on neural machine translation
CN109117483B (en) Training method and device of neural network machine translation model
CN109635124A (en) Distantly supervised relation extraction method combining background knowledge
CN107967318A (en) Automatic scoring method and system for Chinese short-text subjective questions using an LSTM neural network
CN110688862A (en) Mongolian-Chinese inter-translation method based on transfer learning
CN109359291A (en) Named entity recognition method
CN106569998A (en) Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN110134946B (en) Machine reading comprehension method for complex data
CN108897740A (en) Mongolian-Chinese machine translation method based on adversarial neural networks
CN109977234A (en) Knowledge graph completion method based on subject keyword filtering
CN110619127B (en) Mongolian-Chinese machine translation method based on a neural Turing machine
CN113468895B (en) Non-autoregressive neural machine translation method based on decoder input enhancement
CN108932232A (en) Mongolian-Chinese inter-translation method based on an LSTM neural network
CN110909736A (en) Image description method based on a long short-term memory model and an object detection algorithm
CN108765383A (en) Video description method based on deep transfer learning
CN110162789A (en) Vocabulary representation method and device based on Chinese pinyin
CN110837736B (en) Named entity recognition method of Chinese medical record based on word structure
CN110442880B (en) Translation method, device and storage medium for machine translation
CN111414770B (en) Semi-supervised Mongolian neural machine translation method based on collaborative training

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181116