A context-sensitive device and method for translating out-of-vocabulary words in neural machine translation
Technical field
The present invention relates to a lexical translation device and method, and belongs to the technical field of lexical translation devices and methods.
Background technology
Neural machine translation (NMT) is a new approach to machine translation whose core is a deep neural network that is simple to train end-to-end and easy to scale. The network uses an encoder-decoder structure: the encoder maps the source sentence into a fixed-length semantic vector representation, and the decoder, a recurrent neural network (RNN), generates the target-side translation word by word using the source-sentence representation and the target-side history. Since this network was proposed, it has achieved the best results to date on machine translation tasks between many language pairs, such as English-French, German-English, and English-Czech translation.
In practical model implementations, because of the limits on computation and GPU memory, an NMT model must fix in advance a very limited vocabulary of everyday words; all other words are out-of-vocabulary (OOV) words and are marked with the special symbol <unk> (unknown). The vocabulary size is generally set between 30,000 and 80,000. Because translation is an open-vocabulary problem, representing a large number of semantically rich OOV words with a single meaningless <unk> mark greatly increases the ambiguity of the source sentence. Moreover, once a generated translation contains <unk>, the OOV information has already been discarded during NMT decoding, so the model itself cannot handle these <unk> marks; they can only be post-processed in the translation result after translation is complete.
At present, the most widely used approach is a greedy post-processing method: word-alignment information is recorded in the NMT model, usually via the attention mechanism; for each <unk>, the source word with the highest alignment probability is found from the alignment information, the translation candidates of that source word are then looked up in a pre-built translation dictionary, and the candidate with the highest translation probability in the dictionary replaces the <unk> in the translation result. This method also serves as the comparison baseline in the experiments of the present embodiment.
Many experiments have shown that the substitutes this method finds for <unk> can improve the quality of NMT translation results. However, because the context of the <unk> in the translation is not considered during replacement, many problems remain. Translation involves a large number of "one-to-many", "many-to-one", and "many-to-many" mappings, and even in the "one-to-one" case, the same source word may need to be translated into different target words in different contexts. Faced with these complex translation phenomena, the greedy post-processing method causes a large number of replacement errors, repeated replacements, and sentences that are not fluent after replacement.
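The greedy baseline described above can be sketched as follows. The attention-matrix and dictionary formats are illustrative assumptions, not those of any particular NMT toolkit.

```python
# A minimal sketch of the greedy <unk> post-processing baseline: align each
# <unk> to the source word with maximum attention weight, then substitute
# that word's most probable dictionary translation, ignoring context.

def greedy_replace(target_tokens, source_tokens, attention, dictionary):
    """attention[j][i] -- alignment weight of target position j to source i.
    dictionary[src]    -- list of (candidate, prob), sorted by prob descending.
    """
    result = []
    for j, tok in enumerate(target_tokens):
        if tok != "<unk>":
            result.append(tok)
            continue
        # Source word with maximum alignment probability for this <unk>.
        i = max(range(len(source_tokens)), key=lambda i: attention[j][i])
        src = source_tokens[i]
        candidates = dictionary.get(src)
        # Greedy choice: the single most probable candidate.
        result.append(candidates[0][0] if candidates else src)
    return result
```
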
The content of the invention
In order to solve the problem that the translations produced by existing neural machine translation devices cannot match the context or semantics, the present invention proposes a context-sensitive device and method for translating out-of-vocabulary words in neural machine translation.
A context-sensitive device for translating out-of-vocabulary words in neural machine translation adopts the following technical scheme:
The OOV translation device includes:
a lookup module that searches the translation dictionary according to all source-side words;
a candidate-translation providing module that, according to the lookup result obtained by the lookup module, provides possible OOV candidate translations for the <unk> mark;
a feature extraction module for extracting contextual features for the candidate translations;
a ranking module that, for the contextual features, obtains an evaluation score for each OOV candidate translation using a trained SVM-rank model, and sorts the OOV candidate translations by evaluation score from high to low;
a replacement module that replaces the <unk> mark in the sentence translation with the top-ranked OOV candidate translation, obtaining a complete translation that conforms to the context.
Further, the feature extraction module includes:
a word-alignment feature extraction module for extracting word-alignment features from the NMT attention alignment model;
a word-granularity feature extraction module for extracting word-granularity features of the source-side word and the OOV candidate translation;
a phrase-granularity feature extraction module for extracting phrase-granularity features of the source-side word and the OOV candidate translation;
a language-model feature extraction module for extracting language-model features near the OOV candidate translation when it appears at the position of the <unk> mark.
Further, the word-granularity feature extraction module includes:
a forward translation probability module for the probability that the source-side word translates into the OOV candidate translation;
a reverse translation probability module for the probability that the OOV candidate translation translates into the source-side word;
a source-word count extraction module for extracting the number of times the source-side word occurs in the NMT training parallel corpus;
an OOV candidate-translation count extraction module for extracting the number of times the OOV candidate translation occurs in the NMT training parallel corpus;
a co-occurrence count extraction module for extracting the number of times the source-side word and the OOV candidate translation co-occur in the parallel sentence pairs of the parallel corpus;
a vocabulary position extraction module for extracting the position at which the source-side word appears in the vocabulary;
a judgment module for judging whether the source-side word is itself an OOV word.
Further, the phrase-granularity feature extraction module includes:
a source-word in-phrase-table count extraction module for extracting the number of times the source-side word occurs in the phrase table;
an OOV candidate-translation in-phrase-table count extraction module (one) for extracting the number of times the OOV candidate translation occurs in the phrase table;
an in-phrase-table co-occurrence count extraction module for extracting the number of times the source-side word and the OOV candidate translation co-occur in individual phrase pairs of the phrase table;
a phrase count extraction module for extracting the number of times the phrase formed by the source-side word and its neighboring words occurs in the phrase table;
an OOV candidate-translation in-phrase-table count extraction module (two) for extracting, when the source-side word forms a phrase with its neighboring words, the number of times the OOV candidate translation appears in the corresponding target phrase;
an OOV candidate-translation phrase length extraction module for extracting the maximum length of the OOV candidate-translation phrase when the phrases formed by the source-side word and by the OOV candidate translation with their respective neighboring words appear in the phrase table as a phrase pair;
a source-word phrase length extraction module for extracting the length of the source-side phrase when such a phrase pair appears in the phrase table and the OOV candidate-translation phrase attains its maximum length.
Further, the language-model feature extraction module includes:
a forward n-gram language-model probability extraction module for extracting the forward n-gram language-model probability of the OOV candidate translation when it appears at the position of the <unk> mark in the continuous sequence of translated words;
a reverse n-gram language-model probability extraction module for extracting the reverse n-gram language-model probability of the OOV candidate translation when it appears at the position of the <unk> mark in the continuous sequence of translated words;
a word-string count extraction module for extracting the number of word strings of the corresponding order that contain the OOV candidate translation when it appears at the position of the <unk> mark.
A context-sensitive method for translating out-of-vocabulary words in neural machine translation adopts the following technical scheme:
The OOV translation method includes:
a lookup step of searching the translation dictionary according to all source-side words;
a candidate-translation providing step of providing, according to the lookup result obtained by the lookup step, possible OOV candidate translations for the <unk> mark;
a feature extraction step of extracting contextual features for the candidate translations;
a ranking step of obtaining, for the contextual features, an evaluation score for each OOV candidate translation using a trained SVM-rank model, and sorting the OOV candidate translations by evaluation score from high to low;
a replacement step of replacing the <unk> mark in the sentence translation with the top-ranked OOV candidate translation, obtaining a complete translation that conforms to the context.
Further, the feature extraction step includes:
a word-alignment feature extraction step of extracting word-alignment features from the NMT attention alignment model;
a word-granularity feature extraction step of extracting word-granularity features of the source-side word and the OOV candidate translation;
a phrase-granularity feature extraction step of extracting phrase-granularity features of the source-side word and the OOV candidate translation;
a language-model feature extraction step of extracting language-model features near the OOV candidate translation when it appears at the position of the <unk> mark.
Further, the word-granularity feature extraction step includes:
a forward translation probability step for the probability that the source-side word translates into the OOV candidate translation;
a reverse translation probability step for the probability that the OOV candidate translation translates into the source-side word;
a source-word count extraction step of extracting the number of times the source-side word occurs in the NMT training parallel corpus;
an OOV candidate-translation count extraction step of extracting the number of times the OOV candidate translation occurs in the NMT training parallel corpus;
a co-occurrence count extraction step of extracting the number of times the source-side word and the OOV candidate translation co-occur in the parallel sentence pairs of the parallel corpus;
a vocabulary position extraction step of extracting the position at which the source-side word appears in the vocabulary;
a judgment step of judging whether the source-side word is itself an OOV word.
Further, the phrase-granularity feature extraction step includes:
a source-word in-phrase-table count extraction step of extracting the number of times the source-side word occurs in the phrase table;
an OOV candidate-translation in-phrase-table count extraction step (one) of extracting the number of times the OOV candidate translation occurs in the phrase table;
an in-phrase-table co-occurrence count extraction step of extracting the number of times the source-side word and the OOV candidate translation co-occur in individual phrase pairs of the phrase table;
a phrase count extraction step of extracting the number of times the phrase formed by the source-side word and its neighboring words occurs in the phrase table;
an OOV candidate-translation in-phrase-table count extraction step (two) of extracting, when the source-side word forms a phrase with its neighboring words, the number of times the OOV candidate translation appears in the corresponding target phrase;
an OOV candidate-translation phrase length extraction step of extracting the maximum length of the OOV candidate-translation phrase when the phrases formed by the source-side word and by the OOV candidate translation with their respective neighboring words appear in the phrase table as a phrase pair;
a source-word phrase length extraction step of extracting the length of the source-side phrase when such a phrase pair appears in the phrase table and the OOV candidate-translation phrase attains its maximum length.
Further, the language-model feature extraction step includes:
a forward n-gram language-model probability extraction step of extracting the forward n-gram language-model probability of the OOV candidate translation when it appears at the position of the <unk> mark in the continuous sequence of translated words;
a reverse n-gram language-model probability extraction step of extracting the reverse n-gram language-model probability of the OOV candidate translation when it appears at the position of the <unk> mark in the continuous sequence of translated words;
a word-string count extraction step of extracting the number of word strings of the corresponding order that contain the OOV candidate translation when it appears at the position of the <unk> mark.
Beneficial effects of the present invention:
The context-sensitive device and method for translating out-of-vocabulary words in neural machine translation of the present invention translate words in accordance with the context and semantics, yielding better BLEU scores and OOV recall for the translated words. On the Chinese-to-English translation task over the NIST data sets, the BLEU score and OOV recall are 33.405 and 6.53% respectively, improvements of 0.012 and 0.37% over the 33.393 and 6.16% of the prior-art greedy post-processing method; the translation quality of OOV words in NMT translation results is thereby markedly improved.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the context-sensitive device for translating out-of-vocabulary words in neural machine translation of the present invention.
Fig. 2 is a structural schematic diagram of the word-granularity feature extraction module of the present invention.
Fig. 3 is a structural schematic diagram of the phrase-granularity feature extraction module of the present invention.
Fig. 4 is a structural schematic diagram of the language-model feature extraction module of the present invention.
Fig. 5 is a schematic diagram of a worked example of the context-sensitive device for translating out-of-vocabulary words in neural machine translation of the present invention.
Embodiment
The present invention is further described below with reference to specific embodiments, but the present invention should not be limited by these examples.
Embodiment 1:
As shown in Figs. 1 to 4, a context-sensitive device for translating out-of-vocabulary words in neural machine translation adopts the following technical scheme:
The OOV translation device includes:
a lookup module that searches the translation dictionary according to all source-side words;
a candidate-translation providing module that, according to the lookup result obtained by the lookup module, provides possible OOV candidate translations for the <unk> mark;
a feature extraction module for extracting contextual features for the candidate translations;
a ranking module that, for the contextual features, obtains an evaluation score for each OOV candidate translation using the trained SVM-rank model, and sorts the OOV candidate translations by evaluation score from high to low;
a replacement module that replaces the <unk> mark in the sentence translation with the top-ranked OOV candidate translation, obtaining a complete translation that conforms to the context.
Wherein, the feature extraction module includes:
a word-alignment feature extraction module for extracting word-alignment features from the NMT attention alignment model;
a word-granularity feature extraction module for extracting word-granularity features of the source-side word and the OOV candidate translation;
a phrase-granularity feature extraction module for extracting phrase-granularity features of the source-side word and the OOV candidate translation;
a language-model feature extraction module for extracting language-model features near the OOV candidate translation when it appears at the position of the <unk> mark.
The word-granularity feature extraction module includes:
a forward translation probability module for the probability that the source-side word translates into the OOV candidate translation;
a reverse translation probability module for the probability that the OOV candidate translation translates into the source-side word;
a source-word count extraction module for extracting the number of times the source-side word occurs in the NMT training parallel corpus;
an OOV candidate-translation count extraction module for extracting the number of times the OOV candidate translation occurs in the NMT training parallel corpus;
a co-occurrence count extraction module for extracting the number of times the source-side word and the OOV candidate translation co-occur in the parallel sentence pairs of the parallel corpus;
a vocabulary position extraction module for extracting the position at which the source-side word appears in the vocabulary;
a judgment module for judging whether the source-side word is itself an OOV word.
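The word-granularity statistics above can be sketched as follows, computed from a toy parallel corpus. The corpus format, the maximum-likelihood translation probabilities, and the `src_vocab` list are illustrative assumptions, not the invention's exact implementation.

```python
# Word-granularity features for one (source word, OOV candidate) pair:
# forward/reverse translation probability, occurrence and co-occurrence
# counts in the parallel corpus, vocabulary position, and an OOV flag.

def word_features(src_word, cand, corpus, src_vocab):
    """corpus: list of (source_tokens, target_tokens) sentence pairs.
    src_vocab: frequency-ranked source vocabulary (a list of words)."""
    src_count = sum(s.count(src_word) for s, _ in corpus)
    cand_count = sum(t.count(cand) for _, t in corpus)
    cooc = sum(1 for s, t in corpus if src_word in s and cand in t)
    return {
        # Maximum-likelihood lexical translation probabilities.
        "fwd_prob": cooc / src_count if src_count else 0.0,
        "rev_prob": cooc / cand_count if cand_count else 0.0,
        "src_count": src_count,
        "cand_count": cand_count,
        "cooc_count": cooc,
        # Position of the source word in the vocabulary (-1 if absent).
        "vocab_pos": src_vocab.index(src_word) if src_word in src_vocab else -1,
        # Whether the source word is itself out of vocabulary.
        "src_is_oov": src_word not in src_vocab,
    }
```
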
The phrase-granularity feature extraction module includes:
a source-word in-phrase-table count extraction module for extracting the number of times the source-side word occurs in the phrase table;
an OOV candidate-translation in-phrase-table count extraction module (one) for extracting the number of times the OOV candidate translation occurs in the phrase table;
an in-phrase-table co-occurrence count extraction module for extracting the number of times the source-side word and the OOV candidate translation co-occur in individual phrase pairs of the phrase table;
a phrase count extraction module for extracting the number of times the phrase formed by the source-side word and its neighboring words occurs in the phrase table;
an OOV candidate-translation in-phrase-table count extraction module (two) for extracting, when the source-side word forms a phrase with its neighboring words, the number of times the OOV candidate translation appears in the corresponding target phrase;
an OOV candidate-translation phrase length extraction module for extracting the maximum length of the OOV candidate-translation phrase when the phrases formed by the source-side word and by the OOV candidate translation with their respective neighboring words appear in the phrase table as a phrase pair;
a source-word phrase length extraction module for extracting the length of the source-side phrase when such a phrase pair appears in the phrase table and the OOV candidate-translation phrase attains its maximum length.
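The phrase-table statistics above can be sketched as follows. The phrase-table format, a list of (source_phrase, target_phrase) token-list pairs, is an illustrative assumption; real phrase tables (e.g. Moses-style) also carry scores that are omitted here.

```python
# Phrase-granularity features for one (source word, OOV candidate) pair:
# occurrence counts in the phrase table, pairwise co-occurrence, and the
# maximum length of a target phrase containing the candidate whose source
# side contains the source word.

def phrase_features(src_word, cand, phrase_table):
    """phrase_table: list of (source_phrase, target_phrase) token lists."""
    src_count = sum(sp.count(src_word) for sp, _ in phrase_table)
    cand_count = sum(tp.count(cand) for _, tp in phrase_table)
    cooc = sum(1 for sp, tp in phrase_table
               if src_word in sp and cand in tp)
    # Longest target phrase containing the candidate, among phrase pairs
    # whose source side contains the source word.
    max_len = max((len(tp) for sp, tp in phrase_table
                   if src_word in sp and cand in tp), default=0)
    return {"src_count": src_count, "cand_count": cand_count,
            "cooc_count": cooc, "max_cand_phrase_len": max_len}
```
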
Wherein, the language-model feature extraction module includes:
a forward n-gram language-model probability extraction module for extracting the forward n-gram language-model probability of the OOV candidate translation when it appears at the position of the <unk> mark in the continuous sequence of translated words;
a reverse n-gram language-model probability extraction module for extracting the reverse n-gram language-model probability of the OOV candidate translation when it appears at the position of the <unk> mark in the continuous sequence of translated words;
a word-string count extraction module for extracting the number of word strings of the corresponding order that contain the OOV candidate translation when it appears at the position of the <unk> mark.
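The forward language-model feature above can be sketched with a toy bigram model. The maximum-likelihood estimation with add-one smoothing is an illustrative assumption; in practice a pre-trained n-gram model (e.g. from SRILM or KenLM) would supply these probabilities.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Collect unigram and bigram counts from tokenized sentences."""
    uni, bi = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s
        uni.update(toks)
        bi.update(zip(toks, toks[1:]))
    return uni, bi

def bigram_logprob(tokens, model):
    uni, bi = model
    toks = ["<s>"] + tokens
    lp = 0.0
    for a, b in zip(toks, toks[1:]):
        # Add-one smoothing over the observed unigram vocabulary.
        lp += math.log((bi[(a, b)] + 1) / (uni[a] + len(uni)))
    return lp

def lm_feature(translation, unk_index, cand, model):
    """Forward LM log-probability of the translation with the candidate
    substituted into the <unk> slot."""
    filled = translation[:unk_index] + [cand] + translation[unk_index + 1:]
    return bigram_logprob(filled, model)
```
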
A context-sensitive method for translating out-of-vocabulary words in neural machine translation adopts the following technical scheme:
The OOV translation method includes:
a lookup step of searching the translation dictionary according to all source-side words;
a candidate-translation providing step of providing, according to the lookup result obtained by the lookup step, possible OOV candidate translations for the <unk> mark;
a feature extraction step of extracting contextual features for the candidate translations;
a ranking step of obtaining, for the contextual features, an evaluation score for each OOV candidate translation using the trained SVM-rank model, and sorting the OOV candidate translations by evaluation score from high to low;
a replacement step of replacing the <unk> mark in the sentence translation with the top-ranked OOV candidate translation, obtaining a complete translation that conforms to the context.
Wherein, the feature extraction step includes:
a word-alignment feature extraction step of extracting word-alignment features from the NMT attention alignment model;
a word-granularity feature extraction step of extracting word-granularity features of the source-side word and the OOV candidate translation;
a phrase-granularity feature extraction step of extracting phrase-granularity features of the source-side word and the OOV candidate translation;
a language-model feature extraction step of extracting language-model features near the OOV candidate translation when it appears at the position of the <unk> mark.
The word-granularity feature extraction step includes:
a forward translation probability step for the probability that the source-side word translates into the OOV candidate translation;
a reverse translation probability step for the probability that the OOV candidate translation translates into the source-side word;
a source-word count extraction step of extracting the number of times the source-side word occurs in the NMT training parallel corpus;
an OOV candidate-translation count extraction step of extracting the number of times the OOV candidate translation occurs in the NMT training parallel corpus;
a co-occurrence count extraction step of extracting the number of times the source-side word and the OOV candidate translation co-occur in the parallel sentence pairs of the parallel corpus;
a vocabulary position extraction step of extracting the position at which the source-side word appears in the vocabulary;
a judgment step of judging whether the source-side word is itself an OOV word.
The phrase-granularity feature extraction step includes:
a source-word in-phrase-table count extraction step of extracting the number of times the source-side word occurs in the phrase table;
an OOV candidate-translation in-phrase-table count extraction step (one) of extracting the number of times the OOV candidate translation occurs in the phrase table;
an in-phrase-table co-occurrence count extraction step of extracting the number of times the source-side word and the OOV candidate translation co-occur in individual phrase pairs of the phrase table;
a phrase count extraction step of extracting the number of times the phrase formed by the source-side word and its neighboring words occurs in the phrase table;
an OOV candidate-translation in-phrase-table count extraction step (two) of extracting, when the source-side word forms a phrase with its neighboring words, the number of times the OOV candidate translation appears in the corresponding target phrase;
an OOV candidate-translation phrase length extraction step of extracting the maximum length of the OOV candidate-translation phrase when the phrases formed by the source-side word and by the OOV candidate translation with their respective neighboring words appear in the phrase table as a phrase pair;
a source-word phrase length extraction step of extracting the length of the source-side phrase when such a phrase pair appears in the phrase table and the OOV candidate-translation phrase attains its maximum length.
The language-model feature extraction step includes:
a forward n-gram language-model probability extraction step of extracting the forward n-gram language-model probability of the OOV candidate translation when it appears at the position of the <unk> mark in the continuous sequence of translated words;
a reverse n-gram language-model probability extraction step of extracting the reverse n-gram language-model probability of the OOV candidate translation when it appears at the position of the <unk> mark in the continuous sequence of translated words;
a word-string count extraction step of extracting the number of word strings of the corresponding order that contain the OOV candidate translation when it appears at the position of the <unk> mark.
The experimental results of the context-sensitive device and method for translating out-of-vocabulary words in neural machine translation described in this embodiment are shown in Table 1. In Table 1, ① denotes the NMT word-alignment feature, ② the word-granularity features, ③ the phrase-granularity features, and ④ the language-model features.
As can be seen from Table 1, the model trained with all features reaches the highest accuracy, 45.12%, which is 8.23% higher than the 36.89% of the greedy post-processing method.
Table 1: Effect of the model on constructed data for OOV post-processing
We also evaluated actual NMT translation results in an open setting; the experimental results are shown in Table 2. Here we compare against the greedy post-processing method and against directly deleting the <unk> marks during post-processing. The BLEU and Recall(OOV) values of the two methods in Table 2 are averages over all test sets. As can be seen from Table 2, our model exceeds the greedy OOV processing method on both Recall(OOV) and BLEU. This shows that the context-sensitive device and method for translating out-of-vocabulary words in neural machine translation of the present invention constitute a significant technological advance over the existing greedy post-processing method.
Table 2: Effect of the model with extended word-selection scope on real NMT translation results
The context-sensitive device and method for translating out-of-vocabulary words in neural machine translation of the present invention translate words in accordance with the context and semantics, yielding better BLEU scores and OOV recall for the translated words. On the Chinese-to-English translation task over the NIST data sets, the BLEU score and OOV recall are 33.405 and 6.53% respectively, improvements of 0.012 and 0.37% over the 33.393 and 6.16% of the prior-art greedy post-processing method; the translation quality of OOV words in NMT translation results is thereby markedly improved.
Embodiment 2
Embodiment 2 is that unregistered word is translated in being translated to a kind of neural network machine of context-sensitive described in embodiment 1
The further refinement of method, finds a most suitable word using contextual information and goes to replace in NMT translation results<unk>Mark
Note (is used for representing unregistered word) in NMT.The dictionary for translation that unregistered word translating equipment has been constructed according to former terminal word and in advance is carried
Take unregistered word candidate to translate, while recording the former terminal word for producing this unregistered word candidate translation, be not logged in for each
Word candidate translates and former terminal word is extracted 4 class contextual features from different angles to combining former sentence and translation result:NMT words
Alignment feature, word grain size characteristic, phrase grain size characteristic, language model feature, finally using all 4 classes of svm-rank models couplings
Feature goes sequence to obtain optimal substitute, and unregistered word interpretation method is turned in the neural network machine translation of the context-sensitive
Translate<unk>The detailed process of mark is as follows:
Given a sentence translation containing <unk> marks and its corresponding source sentence, the translation flow of this method is as follows:
Step 1: look up the translation dictionary over all source words to provide possible unregistered-word candidate translations for each <unk>.
Step 2: extract contextual features for each unregistered-word candidate translation of each <unk>.
Step 3: rank all unregistered-word candidate translations according to their contextual features using the trained SVM rank model, and replace the <unk> mark in the sentence translation with the highest-ranked word.
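The three steps above can be sketched as follows. The names `translation_dict`, `extract_features` and `ranker` are hypothetical stand-ins for the translation dictionary, the feature extractor and the trained SVM rank model, so this is only an illustrative outline of the flow, not the patented implementation:

```python
def replace_unks(source_words, target_words, attention, translation_dict,
                 extract_features, ranker):
    """Replace each <unk> in target_words with the best-ranked candidate."""
    result = list(target_words)
    for pos, word in enumerate(target_words):
        if word != "<unk>":
            continue
        # Step 1: collect candidate translations from every source word,
        # remembering which source word produced each candidate.
        candidates = []
        for src_pos, src in enumerate(source_words):
            for cand in translation_dict.get(src, []):
                candidates.append((cand, src, src_pos))
        if not candidates:
            continue
        # Step 2: extract contextual features for each candidate.
        scored = []
        for cand, src, src_pos in candidates:
            feats = extract_features(cand, src, src_pos, pos,
                                     source_words, target_words, attention)
            # Step 3: score with the trained ranking model.
            scored.append((ranker(feats), cand))
        # Replace <unk> with the highest-ranked candidate.
        result[pos] = max(scored)[1]
    return result
```

A caller would supply the dictionary built from word alignments, the four-class feature extractor and the SVM rank scorer; with those in place the loop simply fills every <unk> position independently.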
The SVM rank model belongs to the pairwise class of learning-to-rank methods: it is trained to sort a candidate list rather than to perform binary classification. The basic assumption of Rank SVM is that there exists a linear function f(x) = w^T x + b such that f(x_i) > f(x_j) whenever candidate i should be ranked above candidate j. SVM rank thus essentially fits a linear score; this score is not guaranteed to equal the true evaluation metric, but it is guaranteed that ranking the candidates by this score is consistent with ranking them by the true evaluation metric. The present invention adds slack variables to the SVM rank model to handle noise in the input and increase generalization ability; the mathematical form of the model after adding slack variables is:
min (1/2)‖w‖^2 + C Σ ξ_{i,j}
subject to w^T x_i ≥ w^T x_j + 1 − ξ_{i,j} and ξ_{i,j} ≥ 0 for all pairs (i, j) with y_i > y_j,
where x_i and y_i are the feature vector and evaluation score of candidate i respectively, x_j and y_j are the feature vector and evaluation score of candidate j respectively, and ξ_{i,j} is the slack variable.
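A minimal pairwise sketch of this idea (a simplification, not svm-rank itself): for every pair of candidates where i is rated above j, we take a subgradient step on the hinge loss of the score difference w·(x_i − x_j), mirroring the slack constraint above. All names and hyper-parameters here are illustrative.

```python
import random
import numpy as np

def train_pairwise_ranker(X, y, lr=0.1, epochs=200, seed=0):
    """X: (n, d) feature matrix; y: evaluation scores. Returns weights w."""
    random.seed(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    w = np.zeros(d)
    # All ordered pairs (i, j) where candidate i should outrank candidate j.
    pairs = [(i, j) for i in range(n) for j in range(n) if y[i] > y[j]]
    for _ in range(epochs):
        random.shuffle(pairs)
        for i, j in pairs:
            diff = X[i] - X[j]
            # Violated margin constraint w.(x_i - x_j) >= 1: take a hinge step.
            if w @ diff < 1.0:
                w += lr * diff
    return w

def rank_candidates(w, X):
    """Indices of candidates sorted best-first by score w.x."""
    scores = np.asarray(X, dtype=float) @ w
    return list(np.argsort(-scores))
```

The real svm-rank tool solves the regularized quadratic program above exactly; this perceptron-style loop only illustrates why pairwise constraints yield a consistent ordering.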
Once the SVM rank model is chosen, whether the input features are discriminative is the key factor deciding the quality of the model's performance.
The model training process is as follows:
1) SVM rank model training data set
This embodiment extracts 2.1 million Chinese-English parallel sentence pairs as NMT training data from seven LDC data sets: LDC2002E18, LDC2003E07, LDC2003E14, LDC2004T07, LDC2004T08, LDC2005T06 and LDC2005T10, containing 54 million Chinese words and 60 million English words respectively. From the NMT training corpus this embodiment filters out 250,000 parallel sentence pairs containing unregistered words, and constructs 320,000 unregistered-word post-processing training examples from them. In every training example, all words in the source sentence provide unregistered-word candidate translations for the <unk> mark; the candidate scope is the unregistered words among the top 100 words by translation probability in the translation dictionary. On average, each training example finally has 65 unregistered-word candidate translations.
Table 3: sample of the ranking-model training data
Table 3 shows a sample of the ranking-model training data. Columns 1, 2 and 3 are the sequence number, the candidate translation and the corresponding source word respectively. Columns 5 to 32 are the alignment features, word-granularity features, phrase-granularity features and language-model features respectively. Each candidate translation is obtained by looking up the source word in the translation dictionary. This embodiment obtains the attention alignment features of the training data by NMT forced decoding, gathers the word-granularity and language-model features by counting over the 2.1 million parallel sentence pairs, and extracts the phrase-granularity features from the phrase table built with Moses.
This embodiment uses the GIZA++ tool with the standard "grow-diag-final" method on the 2.1 million parallel sentence pairs to obtain a bidirectional word-alignment matrix. Based on this word-alignment result, this embodiment uses maximum likelihood estimation to calculate the forward translation probability from each source word to each target word and the reverse translation probability from each target word to each source word; each word keeps at most 200 candidate translations in the dictionary. Finally, this embodiment obtains two translation dictionaries, source-to-target and target-to-source, which are used to provide unregistered-word candidates and to extract the forward and reverse translation-probability features.
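Assuming the word alignments are already available (here as lists of (source index, target index) pairs per sentence), the maximum-likelihood dictionary step reduces to relative-frequency counting; the data layout below is a hypothetical sketch, not the patented code.

```python
from collections import Counter, defaultdict

def build_translation_dict(aligned_pairs, max_candidates=200):
    """aligned_pairs: iterable of (src_sent, tgt_sent, alignment) triples,
    where alignment is a list of (source index, target index) links.
    Returns {source word: [(target word, p(t|s), p(s|t)), ...]}."""
    cooc = Counter()          # counts of aligned (source, target) word pairs
    src_count = Counter()
    tgt_count = Counter()
    for src_sent, tgt_sent, alignment in aligned_pairs:
        for i, j in alignment:
            s, t = src_sent[i], tgt_sent[j]
            cooc[(s, t)] += 1
            src_count[s] += 1
            tgt_count[t] += 1
    table = defaultdict(list)
    for (s, t), c in cooc.items():
        p_ts = c / src_count[s]           # forward probability  p(t|s)
        p_st = c / tgt_count[t]           # reverse probability  p(s|t)
        table[s].append((t, p_ts, p_st))
    for s in table:                        # keep at most 200 candidates,
        table[s] = sorted(table[s], key=lambda e: -e[1])[:max_candidates]
    return dict(table)
```

Running the same routine with source and target swapped yields the second, target-to-source dictionary.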
In addition, this embodiment extracts four classes of contextual features from different perspectives, as shown in Fig. 5. The four classes are: 1. the word-alignment feature extracted from the NMT attention alignment model; 2. word-granularity features of the source word and the unregistered-word candidate translation; 3. phrase-granularity features of the source word and the unregistered-word candidate translation; 4. language-model features of the neighbourhood of the unregistered-word candidate translation when it appears at the <unk> mark position.
As shown in Fig. 2, taking the source sentence in Fig. 2 as an example:
1. NMT word-alignment feature
For each candidate translation and the source word that produced it, we first extract an NMT word-alignment feature. This feature is the attention score produced when NMT generates the <unk>; it represents the probability that the <unk> in the translation result aligns to that source word. This score is produced inside the NMT model and is also important information connecting the <unk> with the source word.
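Assuming the decoder exposes its attention weights as a target-by-source matrix, extracting this feature reduces to reading one entry of the row for the <unk> position; this is an illustrative sketch, not the patented code.

```python
import numpy as np

def alignment_feature(attention, unk_pos, src_pos):
    """Attention score linking the <unk> at target position unk_pos
    to the source word at src_pos."""
    row = np.asarray(attention[unk_pos], dtype=float)
    row = row / row.sum()        # renormalise in case scores are unnormalised
    return float(row[src_pos])

def most_likely_source(attention, unk_pos):
    """Source position the <unk> most probably aligns to."""
    return int(np.argmax(attention[unk_pos]))
```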
2. Word-granularity features
For each source word and its corresponding candidate translation, we first extract the co-occurrence relation of the two words in the corpus, together with their individual statistics in the corpus. This embodiment extracts 7 word-granularity contextual features:
● p(t|s): forward translation probability from the source word to the candidate translation.
● p(s|t): reverse translation probability from the candidate translation to the source word.
● number_in_corpus(s): number of times the source word occurs in the NMT training parallel corpus.
● number_in_corpus(t): number of times the candidate translation occurs in the NMT training parallel corpus.
● number_cooc_in_corpus(s,t): number of times the source word and the candidate translation co-occur in parallel sentence pairs of the corpus.
● freq_in_vocab(s): position of the source word in the vocabulary sorted from high to low by word frequency in the parallel corpus.
● 1 if s is OOV else 0: whether the source word is an unregistered word; the feature value is 1 if it is, otherwise 0.
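The seven features above can be assembled from pre-computed corpus statistics. In this sketch the probability tables, count tables, frequency-ranked vocabulary and limited NMT vocabulary are hypothetical inputs standing in for statistics gathered over the 2.1-million-sentence training corpus.

```python
def word_features(s, t, p_ts, p_st, src_count, tgt_count, cooc,
                  vocab_rank, nmt_vocab):
    """s: source word, t: candidate translation.
    p_ts/p_st: translation probability tables keyed by (s, t);
    src_count/tgt_count/cooc: occurrence and co-occurrence counts;
    vocab_rank: position in the frequency-sorted vocabulary;
    nmt_vocab: the limited NMT vocabulary (words outside it are OOV)."""
    return {
        "p(t|s)": p_ts.get((s, t), 0.0),
        "p(s|t)": p_st.get((s, t), 0.0),
        "number_in_corpus(s)": src_count.get(s, 0),
        "number_in_corpus(t)": tgt_count.get(t, 0),
        "number_cooc_in_corpus(s,t)": cooc.get((s, t), 0),
        # Unseen words fall to the end of the frequency-sorted vocabulary.
        "freq_in_vocab(s)": vocab_rank.get(s, len(vocab_rank)),
        "1 if s is OOV else 0": 0 if s in nmt_vocab else 1,
    }
```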
3. Phrase-granularity features
We further capture the co-occurrence relations and statistics between the phrases formed by the source word and the candidate translation with their neighbouring words. These features are counted and extracted from the phrase translation table generated by the statistical machine translation tool Moses. This embodiment extracts 7 phrase-granularity contextual features:
● number_in_phrase_table(s): number of times the source word occurs in the phrase table.
● number_in_phrase_table(t): number of times the candidate translation occurs in the phrase table.
● number_cooc_in_phrase_table(s,t): number of times the source word and the candidate translation co-occur in phrase pairs of the phrase table.
● number_in_phrase_table(phrase(s)): number of times the phrase formed by the source word with its neighbouring words occurs in the phrase table.
● number_in_phrase_table(phrase(s)) if t in phrase table: number of times the candidate translation appears in the corresponding target phrase when the source word forms a phrase with its neighbouring words.
● max_length(t) if cooc(phrase(s), phrase(t)): maximum length of the candidate-translation phrase when the phrases formed by the source word and the candidate translation with their respective neighbouring words appear as a pair in the phrase table.
● length(s) if max_length(t) and cooc(phrase(s), phrase(t)): length of the source-word phrase when the phrases formed by the source word and the candidate translation with their respective neighbouring words appear as a pair in the phrase table and the candidate-translation phrase attains its maximum length.
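A sketch of a subset of these phrase features over a toy phrase table, represented here simply as (source phrase, target phrase) string pairs (real Moses phrase tables carry additional score fields). `src_phrase` stands for the phrase that s forms with its neighbours in the current sentence; all names are illustrative.

```python
def phrase_features(s, t, phrase_table, src_phrase):
    """Count-based phrase features; phrase_table is a list of
    (source_phrase, target_phrase) pairs of space-separated words."""
    # Target sides of all phrase pairs whose source side contains s.
    in_src = [tp for sp, tp in phrase_table if s in sp.split()]
    return {
        "number_in_phrase_table(s)": len(in_src),
        "number_in_phrase_table(t)": sum(
            1 for sp, tp in phrase_table if t in tp.split()),
        "number_cooc_in_phrase_table(s,t)": sum(
            1 for tp in in_src if t in tp.split()),
        "number_in_phrase_table(phrase(s))": sum(
            1 for sp, tp in phrase_table if sp == src_phrase),
        "number_in_phrase_table(phrase(s)) if t": sum(
            1 for sp, tp in phrase_table
            if sp == src_phrase and t in tp.split()),
    }
```

The two length-conditioned features would additionally require scanning the matching phrase pairs for the longest target phrase containing t, which is omitted here for brevity.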
4. Language-model features
The language model is an important feature representing the fluency of a word in context. Centred on the candidate translation, this embodiment extracts 15 language-model features from the words before and after the <unk>. For the 5 consecutive translation words A B OOV C D:
● p(OOV|B), p(C|OOV): forward 2-gram language-model features containing the OOV.
● p(B|OOV), p(OOV|C): reverse 2-gram language-model features containing the OOV.
● p(OOV|B,A), p(C|OOV,B), p(D|C,OOV): forward 3-gram language-model features containing the OOV.
● p(A|B,OOV), p(B|OOV,C), p(OOV|C,D): reverse 3-gram language-model features containing the OOV.
● count(B OOV), count(OOV C): counts of the 2-gram word strings containing the OOV.
● count(A B OOV), count(B OOV C), count(OOV C D): counts of the 3-gram word strings containing the OOV.
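Assuming an n-gram language model that exposes a conditional probability `lm_prob(context, word)` and an `ngram_count` lookup (both hypothetical stand-ins for a model estimated on the training corpus), a subset of the fifteen features above can be sketched as:

```python
def lm_features(window, lm_prob, ngram_count):
    """window: the 5 words [A, B, OOV, C, D], with the candidate translation
    substituted at the <unk> position in the middle."""
    A, B, O, C, D = window
    return {
        # Forward 2-gram and 3-gram probabilities around the candidate.
        "p(OOV|B)":   lm_prob((B,), O),
        "p(C|OOV)":   lm_prob((O,), C),
        "p(OOV|A,B)": lm_prob((A, B), O),
        "p(C|B,OOV)": lm_prob((B, O), C),
        "p(D|OOV,C)": lm_prob((O, C), D),
        # Raw n-gram counts containing the candidate.
        "count(B OOV)":   ngram_count((B, O)),
        "count(OOV C)":   ngram_count((O, C)),
        "count(B OOV C)": ngram_count((B, O, C)),
    }
```

The remaining reverse-direction probabilities follow the same pattern with a language model trained on reversed text.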
Although the present invention is disclosed above with preferred embodiments, they are not intended to limit the present invention. Anyone familiar with this technology may make various changes and modifications without departing from the spirit and scope of the present invention; therefore, the protection scope of the present invention shall be defined by the claims.