CN108920472A - A fusion system and method for machine translation systems based on deep learning - Google Patents

A fusion system and method for machine translation systems based on deep learning

Info

Publication number
CN108920472A
CN108920472A
Authority
CN
China
Prior art keywords
translation
module
machine translation
word
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810722720.7A
Other languages
Chinese (zh)
Other versions
CN108920472B (en)
Inventor
Yang Muyun (杨沐昀)
Zhu Junguo (朱骏国)
Zhao Tiejun (赵铁军)
Zhu Conghui (朱聪慧)
Cao Hailong (曹海龙)
Xu Bing (徐冰)
Zheng Dequan (郑德权)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Heilongjiang Industrial Technology Research Institute Asset Management Co ltd
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN201810722720.7A priority Critical patent/CN108920472B/en
Publication of CN108920472A publication Critical patent/CN108920472A/en
Application granted granted Critical
Publication of CN108920472B publication Critical patent/CN108920472B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a fusion system and method for machine translation systems based on deep learning, belonging to the field of lexical fusion technology. The fusion system comprises an input module, an encoding module, a decoding module, and an output module; the fusion method comprises an input step, an encoding step, a decoding step, and an output step. The system and method are characterized by improving the fusion performance of machine translation systems.

Description

A fusion system and method for machine translation systems based on deep learning
Technical field
The present invention relates to a fusion system and method for machine translation systems based on deep learning, belonging to the field of lexical fusion technology.
Background technique
Machine translation system fusion is a method of combining the outputs of different machine translation systems into a new translation of higher quality. Among fusion approaches, word-level (lexical) system fusion is currently the most popular and performs very well. To further improve fusion performance, deep neural network models have been introduced into machine translation output fusion. However, neural network models usually require large amounts of training data before they can outperform classical fusion models. In practical machine translation fusion tasks, limited by the scale of the available training data, training directly on small-scale data usually fails to reach the desired performance and cannot satisfy users' real-time translation needs.
Summary of the invention
To solve the problems of poor lexical fusion performance in existing lexical system fusion technology and the inability to meet real-time translation needs, the present invention proposes a fusion system and method for machine translation systems based on deep learning.
A fusion system for machine translation systems based on deep learning adopts the following technical solution:
The fusion system of the machine translation system comprises:
an input module for inputting one source-language sentence and n machine translations of the source-language sentence, where n is an integer greater than 2;
an encoding module for encoding the source-language sentence and the n machine translations into vectors;
a fusion decoding module for fusing the vectors output by the encoding module and decoding them into a sentence again, where the sentence thus decoded is the fused translation;
a model parameter training module for performing parameter learning on the fusion decoding module; the model training module includes a model building module that constructs a translation model from the source language to the target language on the basis of the original encoder-decoder model;
an output module for outputting the fused translation.
Further, the encoding module includes:
a source-language sentence encoding submodule for encoding the source-language sentence into a sequence of vectors using a bidirectional recurrent neural network structure;
a machine translation encoding submodule for encoding the machine translations by introducing an attention mechanism module over the corresponding source-language sentence.
The machine translation encoding submodule includes:
an input module for inputting a machine-translated sentence and the source-language vector sequence;
an output module for outputting the vector sequence of the machine translation;
a vector calculation module for computing each vector in the vector sequence using a conditional gated recurrent unit (cGRU: Conditional Gated Recurrent Unit). The essence of the attention mechanism in machine translation is to establish word alignment relationships between the source and target languages through the distribution of weights.
The attention mechanism module over the source-language sentence includes:
a coding input module for the source-language sentence (h_i);
a weight calculation module for the source-language encoding (computing each α_ij);
an attention vector generation module (c_j).
Further, the bidirectional recurrent neural network structure uses the GRU (Gated Recurrent Unit) structure.
Further, the fusion decoding module includes one or more multi-attention cGRU_multi-att (Conditional Gated Recurrent Unit with multi-attention) units, where a cGRU unit is two cascaded GRU units, and a cGRU_multi-att unit extends the cGRU unit by allowing multiple attention vectors to be concatenated together as the unit's input. Further, the search space of the fusion decoding module is restricted as follows:
the vocabulary of the final translation is the union of the vocabularies of the candidate translations being fused; that is, every word generated during decoding must appear in the fusion candidates;
decoding preserves the original word order of the fused translations; that is, words are selected from the translation candidates successively from left to right as candidates for words of the final translation;
once a word is used, the words aligned with it cannot be used again; that is, each group of aligned words may appear at most once in the final translation candidates.
A method for machine translation system fusion based on deep learning adopts the following technical solution:
The fusion method includes:
an input step for inputting one source-language sentence and n machine translations of the source-language sentence, where n is an integer greater than 2;
an encoding step for encoding the source-language sentence and the n machine translations into vectors;
a fusion decoding step for fusing the vectors output by the encoding step and decoding them into a sentence again, where the sentence thus decoded is the fused translation;
a model parameter training step for performing parameter learning on the fusion decoding step; the model training step includes a model building step that constructs a translation model from the source language to the target language on the basis of the original encoder-decoder model;
an output step for outputting the fused translation.
Further, the encoding step includes:
a source-language sentence encoding sub-step for encoding the source-language sentence into a sequence of vectors using a bidirectional recurrent neural network structure;
a machine translation encoding sub-step for encoding the machine translations by introducing an attention mechanism step over the corresponding source-language sentence.
The machine translation encoding sub-step includes:
an input step for inputting a machine-translated sentence and the source-language vector sequence;
an output step for outputting the vector sequence of the machine translation;
a vector calculation step for computing each vector in the vector sequence using a conditional gated recurrent unit (cGRU: Conditional Gated Recurrent Unit).
The attention mechanism step over the source-language sentence includes:
a coding input step for the source-language sentence (h_i);
a weight calculation step for the source-language encoding (computing each α_ij);
an attention vector generation step (c_j).
Further, the bidirectional recurrent neural network structure uses the GRU structure.
Further, the fusion decoding step includes one or more multi-attention cGRU_multi-att units.
Further, the search space of the fusion decoding step is restricted as follows:
the vocabulary of the final translation is the union of the vocabularies of the candidate translations being fused; that is, every word generated during decoding must appear in the fusion candidates;
decoding preserves the original word order of the fused translations; that is, words are selected from the translation candidates successively from left to right as candidates for words of the final translation;
once a word is used, the words aligned with it cannot be used again; that is, each group of aligned words may appear at most once in the final translation candidates.
Beneficial effects of the present invention:
The fusion system and method for machine translation systems based on deep learning proposed by the present invention, through the use of a bidirectional recurrent neural network structure and through the design of the encoding module, the fusion decoding module, and their processing methods, enable machine translation system fusion to exceed the BLEU-4 score of classical fusion models without requiring large amounts of training data, thereby greatly improving fusion performance. Moreover, since the system and method can be trained with only a small amount of data, small-scale training data can be used directly in practical applications to obtain high lexical fusion performance, allowing machine translation system fusion to be paired with other models trained at small scale and effectively improving its practicality.
In practical applications, the proposed fusion system and method can perform lexical fusion without first obtaining large numbers of machine translations from the systems participating in fusion, which effectively reduces the time spent acquiring translations, greatly improves the efficiency of lexical fusion, and effectively meets customers' real-time translation needs.
Brief description of the drawings
Fig. 1 is a schematic diagram of the structure of the machine translation system fusion system of the present invention based on deep learning.
Fig. 2 is a schematic diagram of the GRU structure of the present invention.
Fig. 3 is a schematic diagram of the attention mechanism module of the present invention.
Fig. 4 is a schematic diagram of the multi-attention mechanism module of the present invention.
Fig. 5 is a schematic diagram of the model training module of the present invention.
Specific embodiment
The present invention will be further described below with reference to specific embodiments, but the present invention is not limited to these examples.
Embodiment 1:
A fusion system for machine translation systems based on deep learning uses the classical encoder-decoder model framework from deep learning. As shown in Fig. 1, the machine translation system fusion system comprises:
an input module for inputting one source-language sentence and n machine translations of the source-language sentence, where n is an integer greater than 2;
an encoding module for encoding the source-language sentence and the n machine translations into vectors;
a fusion decoding module for fusing the vectors output by the encoding module and decoding them into a sentence again, where the sentence thus decoded is the fused translation;
a model parameter training module for performing parameter learning on the fusion decoding module; the model training module includes a model building module that constructs a translation model from the source language to the target language on the basis of the original model;
an output module for outputting the fused translation.
In Fig. 1, S is the source-language sentence; T(1), T(2), …, T(n) are the n machine translations of the source-language sentence S; H is the encoding of the source-language sentence; Q(1), Q(2), …, Q(n) are the encodings of the n machine translations; and E is the final output translation. Given a source-language sentence S and its corresponding n machine translation results T(1), T(2), …, T(n), the fusion model attempts to find an optimal translation E in the translation search space.
The encoding module includes:
a source-language sentence encoding submodule for encoding the source-language sentence into a sequence of vectors using a bidirectional recurrent neural network structure; the bidirectional recurrent neural network uses the GRU structure, whose specific structure is shown in Fig. 2. In Fig. 2, h is the state of the GRU unit, h (underlined) is the transitory (candidate) state of the GRU unit, z is the update gate, and r is the reset gate. Essentially, these two gate vectors determine which information is ultimately passed on as the output of the gated recurrent unit. The two gating mechanisms are characterized by their ability to retain information over long sequences, neither discarding it over time nor removing it merely because it appears unrelated to the prediction.
a machine translation encoding submodule for encoding the machine translations by introducing an attention mechanism module over the corresponding source-language sentence. The essence of the attention mechanism in machine translation is to establish word alignment relationships between the source and target languages through the distribution of weights. Since machine translations are usually of imperfect quality and the information they contain is not fully accurate, this embodiment introduces an attention mechanism module over the source-language sentence, which can compensate for information that is lost or erroneous in the machine translations.
The machine translation encoding submodule includes:
an input module for inputting a machine-translated sentence and the source-language vector sequence;
an output module for outputting the vector sequence of the machine translation;
a vector calculation module for computing each vector in the vector sequence using a conditional gated recurrent unit (cGRU: Conditional Gated Recurrent Unit).
Since the translations generated by machine translation systems are generally imperfect and the information they express is not accurate enough, encoding a machine translation in the same way as the source language described above would introduce a certain amount of noise into the encoding. The information expressed in the source-language sentence, by contrast, is usually considered accurate. Therefore, during the encoding of a machine translation, attention over its corresponding source-language sentence is introduced; in this way the machine translation can be encoded more accurately, effectively improving encoding accuracy. The structure of the attention mechanism module is shown in Fig. 3. The attention mechanism module over the source-language sentence includes:
a coding input module for the source-language sentence (h_i);
a weight calculation module for the source-language encoding (computing each α_ij);
an attention vector generation module (c_j).
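The three steps just listed (encoded inputs h_i, weights α_ij, context c_j) can be sketched as follows. The patent does not specify the score function, so a dot-product score is assumed here for brevity; the original attention of Bahdanau et al. uses an additive MLP instead:

```python
import numpy as np

def attention(H_src, s):
    """Given source encodings H_src (T x d) and a decoder state s (d,),
    compute weights alpha over source positions and the context c_j."""
    scores = H_src @ s                  # one score per source word
    e = np.exp(scores - scores.max())
    alpha = e / e.sum()                 # the alpha_ij weights (softmax)
    c = alpha @ H_src                   # attention (context) vector c_j
    return alpha, c

H_src = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 source words, d=2
s = np.array([2.0, 0.0])
alpha, c = attention(H_src, s)
print(alpha.sum())  # 1.0
```

The weights form a probability distribution over source positions, which is exactly the soft word-alignment interpretation given in the text above.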
The fusion decoding module uses a multi-attention cGRU_multi-att unit, which allows multiple context vectors to be input and concatenated into a new context vector. Its input is the vector sequence of the source-language sentence and the vector sequences of its multiple machine translations, and its final output is the fused translation; the specific structure is shown in Fig. 4. A cGRU unit is two cascaded GRU units, and the cGRU_multi-att unit extends the cGRU unit by allowing multiple attention vectors to be concatenated together as the unit's input.
The advantage of using the multi-attention cGRU_multi-att unit is that the usable information in the machine translations can be brought together to influence the generation of the final translation, so that the generated translation combines information from all the machine translations. At the same time, the cGRU_multi-att unit can effectively compute the alignment between the position of each word in the machine translations and the position of each word in the final translation.
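The concatenated multi-attention input described above can be sketched as follows: run one attention pass over the source encoding and one over each machine translation hypothesis, then concatenate the per-sequence context vectors into the single vector the cGRU_multi-att unit consumes. The dimensions and the dot-product score are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_att_context(s, encodings):
    """One decoding step's joint context: one attention pass per encoded
    sequence (source + each MT hypothesis), contexts concatenated."""
    contexts = [softmax(H @ s) @ H for H in encodings]
    return np.concatenate(contexts)

rng = np.random.default_rng(1)
d = 3
source = rng.standard_normal((4, d))                     # encoded source
hyps = [rng.standard_normal((5, d)) for _ in range(2)]   # 2 MT hypotheses
joint = multi_att_context(rng.standard_normal(d), [source] + hyps)
print(joint.shape)  # (9,) -- 3 sequences x d=3
```

Because the contexts are concatenated rather than summed, the decoder can weigh the source and each hypothesis independently, which matches the stated goal of letting every hypothesis influence the final translation.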
Softmax-based decoding theoretically uses a vocabulary of fixed size, which means that words outside the vocabulary of the translations participating in fusion could be generated. In practice, classical multi-translation fusion usually takes the words in the translations participating in fusion as a dynamic vocabulary during decoding. Therefore, this embodiment restricts the decoding search space. The restrictions on the search space of the fusion decoding module are as follows:
the vocabulary of the final translation is the union of the vocabularies of the candidate translations being fused; that is, every word generated during decoding must appear in the fusion candidates;
decoding preserves the original word order of the fused translations; that is, words are selected from the translation candidates successively from left to right as candidates for words of the final translation;
once a word is used, the words aligned with it cannot be used again; that is, each group of aligned words may appear at most once in the final translation candidates.
The advantages of the above restrictions are, on the one hand, that they solve the unregistered-word (out-of-vocabulary) problem in decoding, because even if the translation contains an unregistered word, the corresponding word can still be found in the original candidate translations participating in fusion; on the other hand, they alleviate the over-translation and under-translation problems of neural network decoding, effectively improving translation accuracy and efficiency.
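A toy sketch of the three constraints follows. The representation is my own, not the patent's: each hypothesis keeps a left-to-right pointer; the decoder may only emit a hypothesis's next unconsumed word (dynamic union vocabulary, original order preserved); and emitting a word advances every hypothesis whose next word matches it, which approximates the rule that aligned words are used at most once:

```python
def allowed_next(hyps, ptrs):
    """Words the constrained decoder may emit: the leftmost unconsumed
    word of each candidate translation."""
    return {h[p] for h, p in zip(hyps, ptrs) if p < len(h)}

def consume(word, hyps, ptrs):
    """Advance the pointer of every hypothesis whose next word equals
    the emitted word, so aligned words are used at most once."""
    return [p + 1 if p < len(h) and h[p] == word else p
            for h, p in zip(hyps, ptrs)]

hyps = [["the", "cat", "sat"], ["a", "cat", "sits"]]
ptrs = [0, 0]
print(sorted(allowed_next(hyps, ptrs)))  # ['a', 'the']
ptrs = consume("the", hyps, ptrs)
print(ptrs)                              # [1, 0]
print(sorted(allowed_next(hyps, ptrs)))  # ['a', 'cat']
```

A real beam-search decoder would apply this filter to the softmax output at every step, which is how the dynamic vocabulary sidesteps unregistered words.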
The fusion system for machine translation systems based on deep learning proposed in this embodiment takes as input a machine-translated sentence and the vector sequence of its source language, and outputs the vector sequence of the machine translation. It uses an attention mechanism over the source-language sentence to resolve source-language dependencies, and computes each vector in the sequence with a conditional gated recurrent unit (Conditional Gated Recurrent Unit, cGRU), effectively eliminating the influence of noise on encoding and improving encoding accuracy. The proposed fusion system solves the problem of unregistered words in decoding affecting translation accuracy, effectively improving the accuracy of translation, and also alleviates the over-translation and under-translation problems of neural network decoding, further improving the accuracy and efficiency of the translation process and effectively improving fusion performance.
Embodiment 2
A fusion method for machine translation systems based on deep learning adopts the following technical solution:
The fusion method includes:
an input step for inputting one source-language sentence and n machine translations of the source-language sentence, where n is an integer greater than 2;
an encoding step for encoding the source-language sentence and the n machine translations into vectors;
a fusion decoding step for fusing the vectors output by the encoding step and decoding them into a sentence again, where the sentence thus decoded is the fused translation;
a model parameter training step for performing parameter learning on the fusion decoding step; the model training step includes a model building step that constructs a translation model from the source language to the target language on the basis of the original model;
an output step for outputting the fused translation.
The encoding step includes:
a source-language sentence encoding sub-step for encoding the source-language sentence into a sequence of vectors using a bidirectional recurrent neural network structure; the bidirectional recurrent neural network structure uses the GRU structure;
a machine translation encoding sub-step for encoding the machine translations by introducing an attention mechanism step over the corresponding source-language sentence.
The machine translation encoding sub-step includes:
an input step for inputting a machine-translated sentence and the source-language vector sequence;
an output step for outputting the vector sequence of the machine translation;
a vector calculation step for computing each vector in the vector sequence using a conditional gated recurrent unit (cGRU: Conditional Gated Recurrent Unit).
The attention mechanism step over the source-language sentence includes:
a coding input step for the source-language sentence (h_i);
a weight calculation step for the source-language encoding (computing each α_ij);
an attention vector generation step (c_j).
The restrictions on the search space of the fusion decoding step are as follows:
the vocabulary of the final translation is the union of the vocabularies of the candidate translations being fused; that is, every word generated during decoding must appear in the fusion candidates;
decoding preserves the original word order of the fused translations; that is, words are selected from the translation candidates successively from left to right as candidates for words of the final translation;
once a word is used, the words aligned with it cannot be used again; that is, each group of aligned words may appear at most once in the final translation candidates.
The experimental verification of the fusion system and method of the present invention for machine translation systems based on deep learning proceeds as follows:
To verify the effectiveness of the method, our data selection had to satisfy the following conditions: 1) the data used in the first stage should be publicly available bilingual data, large enough to train a neural translation model with good performance; 2) the second stage should use publicly available training data of smaller scale, because in practical application scenarios large-scale machine translation results cannot be obtained in real time; 3) the second-stage training data and the final test set must include the source language, reference translations, and the translations of multiple black-box machine translation systems. Accordingly, for the first-stage training set we used the Czech-to-English Europarl_v7 bilingual data set released by WMT, which contains about 645K bilingual sentence pairs. For the second-stage training set we used the machine translation fusion training set provided by the WMT2011 translation fusion task, comprising 1003 Czech-to-English bilingual sentence pairs and the translation results of 8 black-box machine translation systems for each pair. The test set is that provided by the WMT2011 translation fusion task, comprising 2000 Czech-to-English bilingual sentence pairs and, for each pair, the translation results of the same 8 black-box machine translation systems as in the WMT2011 fusion-task training set. Without loss of generality, we randomly selected the translation results of 4 of these systems; the BLEU-4 scores of the four systems are shown in Table 1.
Table 1: Translation BLEU-4 scores of each machine translation system
System identifier BLEU-4
System A 19.65
System B 21.58
System C 22.00
System D 23.53
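Since Table 1 (and Table 2 below) report BLEU-4, here is a simplified single-reference, sentence-level BLEU-4 sketch: the geometric mean of modified 1- to 4-gram precisions times a brevity penalty. Real evaluations use corpus-level BLEU with proper smoothing (e.g. the mteval script or sacrebleu); the tiny smoothing constant here is an illustrative assumption:

```python
import math
from collections import Counter

def bleu4(hyp, ref):
    """Simplified BLEU-4: geometric mean of clipped 1..4-gram
    precisions, scaled by a brevity penalty."""
    def ngrams(toks, n):
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    log_p = 0.0
    for n in range(1, 5):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matched = sum((h & r).values())      # clipped n-gram matches
        total = max(sum(h.values()), 1)
        log_p += math.log(max(matched, 1e-9) / total) / 4.0
    bp = 1.0 if len(hyp) >= len(ref) else math.exp(1.0 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_p)

ref = "the cat sat on the mat".split()
print(round(bleu4(ref, ref), 2))  # 1.0
```

The scores in the tables are corpus-level percentages (e.g. 23.53 ≈ 0.2353 on this 0–1 scale), so this sketch is for intuition only.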
The specific experimental results are as follows:
We selected the classical voting-based system fusion model (MEMT) as our baseline; this method has repeatedly performed well across the multiple language pairs of the WMT system combination tasks. The BLEU scores of each system fusion method are shown in Table 2.
Table 2: BLEU scores of each system fusion method
Fusion method BLEU-4
Best single system 23.53
Baseline fusion system (MEMT) 24.07
Our method 24.95
Our method (without decoding constraints) 24.74
Our method (without source language or decoding constraints) 24.32
From the experimental results it can be concluded that the neural-network-based system fusion model significantly outperforms the classical baseline fusion model MEMT. The main reason is that the deep-fusion-based method can more effectively retain the information of the machine translations to be fused; these pieces of information interact within the neural network, allowing the translations to be fused more effectively. Fusing source-language information improves the BLEU score by 0.42 relative to not fusing it, which shows that the source language still contributes useful information in the final fusion decoding and helps the fused decoding. Using decoding constraints further improves the model by 0.21 BLEU, which shows that some features of the classical model remain highly effective and are not captured by the deep fusion model, so they can further lift model performance after fusion.
Although the present invention has been disclosed above in terms of preferred embodiments, they are not intended to limit the invention. Anyone familiar with this technology may make various changes and modifications without departing from the spirit and scope of the present invention; therefore the scope of protection of the present invention shall be as defined by the claims.

Claims (10)

1. A fusion system for machine translation systems based on deep learning, characterized in that the fusion system comprises:
an input module for inputting one source-language sentence and n machine translations of the source-language sentence, where n is an integer greater than 2;
an encoding module for encoding the source-language sentence and the n machine translations into vectors;
a fusion decoding module for fusing the vectors output by the encoding module and decoding them into a sentence again, where the sentence thus decoded is the fused translation;
a model parameter training module for performing parameter learning on the fusion decoding module, the model training module including a model building module that constructs a translation model from the source language to the target language on the basis of the original encoder-decoder model;
an output module for outputting the fused translation.
2. The fusion system of the machine translation system based on deep learning according to claim 1, characterized in that the encoding module comprises:
a source-language sentence encoding submodule for encoding the source-language sentence into a sequence of vectors using a bidirectional recurrent neural network;
a machine-translation encoding submodule for encoding each machine-translated version by introducing an attention mechanism over the corresponding source-language sentence;
wherein the machine-translation encoding submodule comprises:
an input module for inputting the machine-translated sentence and the source-language vector sequence;
an output module for outputting the vector sequence of the machine translation;
a vector computation module for computing each vector in the sequence using a conditional gated recurrent unit;
and the attention-mechanism module over the source-language sentence comprises:
an encoding input module for the source-language sentence;
a weight computation module over the source-language encodings;
an attention-vector generation module.
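The attention-mechanism module of claim 2 (weight computation over the source encodings followed by attention-vector generation) matches the standard softmax-attention pattern. A minimal sketch, assuming dot-product scoring; the patent does not specify the scoring function, so this is an illustrative choice:

```python
import math

def attention(query, keys, values):
    # weight computation: dot-product scores over the source encodings,
    # normalized with a numerically stable softmax
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # attention-vector generation: weighted sum of the value vectors
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return context, weights

# query aligned with the first source encoding gets the larger weight
ctx, weights = attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```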
3. The fusion system of the machine translation system based on deep learning according to claim 2, characterized in that the bidirectional recurrent neural network uses a GRU structure.
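A bidirectional GRU encoder as in claims 2 and 3 runs one recurrent pass left-to-right and one right-to-left, then pairs the two hidden states at each position. The sketch below uses a one-dimensional GRU cell with hand-picked scalar weights purely for illustration; a real encoder uses learned weight matrices over word embeddings:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    # p holds the six scalar weights of a 1-dimensional GRU cell
    z = sigmoid(p["Wz"] * x + p["Uz"] * h)               # update gate
    r = sigmoid(p["Wr"] * x + p["Ur"] * h)               # reset gate
    h_tilde = math.tanh(p["W"] * x + p["U"] * (r * h))   # candidate state
    return (1.0 - z) * h + z * h_tilde

def bigru_encode(xs, p):
    # forward pass over the input sequence
    h, fwd = 0.0, []
    for x in xs:
        h = gru_step(x, h, p)
        fwd.append(h)
    # backward pass over the reversed sequence
    h, bwd = 0.0, []
    for x in reversed(xs):
        h = gru_step(x, h, p)
        bwd.append(h)
    bwd.reverse()
    # the annotation for each position concatenates both directions
    return list(zip(fwd, bwd))

p = dict(Wz=0.5, Uz=0.5, Wr=0.5, Ur=0.5, W=1.0, U=1.0)
enc = bigru_encode([1.0, -1.0, 0.5], p)   # one (forward, backward) pair per token
```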
4. The fusion system of the machine translation system based on deep learning according to claim 1, characterized in that the fusion-decoding module comprises a multi-attention cGRU (cGRUmutil-att) unit.
5. The fusion system of the machine translation system based on deep learning according to claim 1, characterized in that the search space of the fusion-decoding module is restricted as follows:
the vocabulary of the final translation is the union of the vocabularies of the candidate translations, i.e. every word generated during decoding must appear in at least one fusion candidate;
the original word order of each candidate is preserved during decoding, i.e. words must be selected from each candidate from left to right as candidates for the words of the final translation;
once a word is used, the words it is aligned with can no longer be used, i.e. each group of aligned words may appear only once across the translation candidates.
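The three search-space restrictions of claim 5 can be illustrated with a toy greedy fuser: per-candidate pointers enforce the left-to-right order, the pointer options at each step realize the vocabulary union, and a set of used alignment groups blocks words aligned to an already chosen word. `fuse_greedy`, `group_of`, and the word scores are illustrative assumptions, not the patent's decoder (which performs neural beam search under these constraints):

```python
def fuse_greedy(candidates, group_of, score):
    # ptr[i] is the leftmost unused position in candidate i (constraint 2)
    ptr = [0] * len(candidates)
    used_groups = set()      # constraint 3: each alignment group used once
    out = []
    while True:
        opts = []
        for i, cand in enumerate(candidates):
            # skip words whose alignment group was already consumed
            while ptr[i] < len(cand) and group_of[cand[ptr[i]]] in used_groups:
                ptr[i] += 1
            if ptr[i] < len(cand):
                # constraint 1: output vocabulary is the union of candidates
                opts.append((i, cand[ptr[i]]))
        if not opts:
            break
        i, w = max(opts, key=lambda t: score(t[1]))   # greedy choice
        out.append(w)
        used_groups.add(group_of[w])
        ptr[i] += 1
    return out

candidates = [["the", "cat", "sat"], ["a", "cat", "sat", "down"]]
group_of = {"the": 0, "a": 0, "cat": 1, "sat": 2, "down": 3}
scores = {"the": 0.9, "a": 0.4, "cat": 0.8, "sat": 0.7, "down": 0.3}
out = fuse_greedy(candidates, group_of, scores.__getitem__)
```

After "the" (group 0) is chosen, its aligned alternative "a" is skipped, while "down", which only one candidate offers, is still reachable through the union constraint.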
6. A fusion method for machine translation systems based on deep learning, characterized in that the method comprises:
an input step of inputting one source-language sentence and n machine-translated versions of the source-language sentence, where n is an integer greater than 2;
an encoding step of encoding the source-language sentence and the n machine translations into vectors;
a fusion-decoding step of fusing the vectors output by the encoding step and decoding them back into a sentence, the decoded sentence being the fused translation;
a model-parameter training step of performing parameter learning on the fusion-decoding step, the model training step comprising a model-building step of constructing a translation model from the source language to the target language on the basis of the original model;
an output step of outputting the fused translation.
7. The fusion method of the machine translation system according to claim 6, characterized in that the encoding step comprises:
a source-language sentence encoding sub-step of encoding the source-language sentence into a sequence of vectors using a bidirectional recurrent neural network;
a machine-translation encoding sub-step of encoding each machine-translated version by introducing an attention mechanism over the corresponding source-language sentence;
wherein the machine-translation encoding sub-step comprises:
an input step of inputting the machine-translated sentence and the source-language vector sequence;
an output step of outputting the vector sequence of the machine translation;
a vector computation step of computing each vector in the sequence using a conditional gated recurrent unit;
and the attention-mechanism step over the source-language sentence comprises:
an encoding input step for the source-language sentence;
a weight computation step over the source-language encodings;
an attention-vector generation step.
8. The fusion method of the machine translation system according to claim 7, characterized in that the bidirectional recurrent neural network uses a GRU structure.
9. The fusion method of the machine translation system according to claim 6, characterized in that the fusion-decoding step uses a multi-attention cGRU (cGRUmutil-att) unit.
10. The fusion method of the machine translation system according to claim 6, characterized in that the search space of the fusion-decoding step is restricted as follows:
the vocabulary of the final translation is the union of the vocabularies of the candidate translations, i.e. every word generated during decoding must appear in at least one fusion candidate;
the original word order of each candidate is preserved during decoding, i.e. words must be selected from each candidate from left to right as candidates for the words of the final translation;
once a word is used, the words it is aligned with can no longer be used, i.e. each group of aligned words may appear only once across the translation candidates.
CN201810722720.7A 2018-07-04 2018-07-04 Fusion system and method of machine translation system based on deep learning Active CN108920472B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810722720.7A CN108920472B (en) 2018-07-04 2018-07-04 Fusion system and method of machine translation system based on deep learning


Publications (2)

Publication Number Publication Date
CN108920472A true CN108920472A (en) 2018-11-30
CN108920472B CN108920472B (en) 2020-01-10

Family

ID=64425489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810722720.7A Active CN108920472B (en) 2018-07-04 2018-07-04 Fusion system and method of machine translation system based on deep learning

Country Status (1)

Country Link
CN (1) CN108920472B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8326598B1 (en) * 2007-03-26 2012-12-04 Google Inc. Consensus translations from multiple machine translation systems
CN103646019A (en) * 2013-12-31 2014-03-19 哈尔滨理工大学 Method and device for fusing multiple machine translation systems
US9201871B2 (en) * 2010-06-11 2015-12-01 Microsoft Technology Licensing, Llc Joint optimization for machine translation system combination
CN103605644B (en) * 2013-12-02 2017-02-01 哈尔滨工业大学 Pivot language translation method and device based on similarity matching
CN106407184A (en) * 2015-07-30 2017-02-15 阿里巴巴集团控股有限公司 Decoding method used for statistical machine translation, and statistical machine translation method and apparatus
CN104462072B * 2014-11-21 2017-09-26 中国科学院自动化研究所 Input method and device for computer-assisted translation
CN107590135A (en) * 2016-07-07 2018-01-16 三星电子株式会社 Automatic translating method, equipment and system


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Maoxi, et al.: "A Survey of System Combination Techniques for Machine Translation" (机器翻译系统融合技术综述), Journal of Chinese Information Processing (《中文信息学报》) *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740168A * 2019-01-09 2019-05-10 北京邮电大学 Method for translating ancient texts of traditional Chinese medicine classics based on a TCM knowledge graph and an attention mechanism
CN109740168B * 2019-01-09 2020-10-13 北京邮电大学 Traditional Chinese medicine classical book and ancient sentence translation method based on traditional Chinese medicine knowledge graph and attention mechanism
CN110263348A * 2019-03-06 2019-09-20 腾讯科技(深圳)有限公司 Translation method and apparatus, computer device and storage medium
CN110046359A * 2019-04-16 2019-07-23 苏州大学 Neural machine translation method based on sample guidance
CN110175338A * 2019-05-31 2019-08-27 北京金山数字娱乐科技有限公司 Data processing method and device
CN110175338B * 2019-05-31 2023-09-26 北京金山数字娱乐科技有限公司 Data processing method and device
CN110263353A * 2019-06-25 2019-09-20 北京金山数字娱乐科技有限公司 Machine translation method and device
CN110263353B * 2019-06-25 2023-10-13 北京金山数字娱乐科技有限公司 Machine translation method and device
CN110472253B * 2019-08-15 2022-10-25 哈尔滨工业大学 Sentence-level machine translation quality estimation model training method based on mixed granularity
CN110472253A * 2019-08-15 2019-11-19 哈尔滨工业大学 Sentence-level machine translation quality estimation model training method based on mixed granularity
CN111274827A * 2020-01-20 2020-06-12 南京新一代人工智能研究院有限公司 Suffix translation method based on bag-of-words multi-objective learning
CN111652004A (en) * 2020-05-09 2020-09-11 清华大学 Fusion method and device for machine translation system
US11947925B2 (en) 2020-05-21 2024-04-02 International Business Machines Corporation Adaptive language translation using context features
CN112183080A (en) * 2020-10-20 2021-01-05 新疆大学 Uyghur-Chinese machine translation system based on word and morpheme mixed model


Similar Documents

Publication Publication Date Title
CN108920472A (en) A kind of emerging system and method for the machine translation system based on deep learning
Zheng et al. A pre-training based personalized dialogue generation model with persona-sparse data
Chen et al. Two-stream network for sign language recognition and translation
CN105843801B Construction system for multi-translation parallel corpora
CN110334361A Neural machine translation method for low-resource languages
CN107967262A Neural-network Mongolian-Chinese machine translation method
CN108897740A Mongolian-Chinese machine translation method based on adversarial neural networks
CN110688861B (en) Multi-feature fusion sentence-level translation quality estimation method
CN110309287A Retrieval-based chitchat dialogue scoring method modeling dialogue-turn information
CN110851575B (en) Dialogue generating system and dialogue realizing method
CN108932232A Mongolian-Chinese mutual translation method based on an LSTM neural network
CN112257465B (en) Multi-mode machine translation data enhancement method based on image description generation
Dutta Chowdhury et al. Multimodal neural machine translation for low-resource language pairs using synthetic data
Zheng et al. Enhancing neural sign language translation by highlighting the facial expression information
CN108984539A Neural machine translation method based on simulated future-time-step translation information
Wang et al. A text-guided generation and refinement model for image captioning
CN112507733A Chinese-Vietnamese neural machine translation method based on a dependency graph network
CN114691858B Improved UNILM-based summary generation method
CN113657125B (en) Mongolian non-autoregressive machine translation method based on knowledge graph
Xue et al. Lipformer: learning to lipread unseen speakers based on visual-landmark transformers
CN116721176B (en) Text-to-face image generation method and device based on CLIP supervision
CN112926344A (en) Word vector replacement data enhancement-based machine translation model training method and device, electronic equipment and storage medium
Shi et al. Adding Visual Information to Improve Multimodal Machine Translation for Low‐Resource Language
Wu et al. Enhancing pre-trained models with text structure knowledge for question generation
Liu et al. A novel domain adaption approach for neural machine translation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Yang Muyun

Inventor after: Zhu Junguo

Inventor after: Zhao Tiejun

Inventor after: Zhu Conghui

Inventor after: Cao Hailong

Inventor after: Xu Bing

Inventor after: Zheng Dequan

Inventor before: Yang Muyun

Inventor before: Zhu Junguo

Inventor before: Zhao Tiejun

Inventor before: Zhu Conghui

Inventor before: Cao Hailong

Inventor before: Xu Bing

Inventor before: Zheng Dequan

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210125

Address after: Building 9, accelerator, 14955 Zhongyuan Avenue, Songbei District, Harbin City, Heilongjiang Province

Patentee after: INDUSTRIAL TECHNOLOGY Research Institute OF HEILONGJIANG PROVINCE

Address before: 150001 No. 92 West straight street, Nangang District, Heilongjiang, Harbin

Patentee before: HARBIN INSTITUTE OF TECHNOLOGY

TR01 Transfer of patent right

Effective date of registration: 20230315

Address after: 150027 Room 412, Unit 1, No. 14955, Zhongyuan Avenue, Building 9, Innovation and Entrepreneurship Plaza, Science and Technology Innovation City, Harbin Hi tech Industrial Development Zone, Heilongjiang Province

Patentee after: Heilongjiang Industrial Technology Research Institute Asset Management Co.,Ltd.

Address before: Building 9, accelerator, 14955 Zhongyuan Avenue, Songbei District, Harbin City, Heilongjiang Province

Patentee before: INDUSTRIAL TECHNOLOGY Research Institute OF HEILONGJIANG PROVINCE