CN117273027B - Automatic machine translation post-verification method based on translation error correction - Google Patents
- Publication number
- CN117273027B CN117273027B CN202311559241.5A CN202311559241A CN117273027B CN 117273027 B CN117273027 B CN 117273027B CN 202311559241 A CN202311559241 A CN 202311559241A CN 117273027 B CN117273027 B CN 117273027B
- Authority
- CN
- China
- Prior art keywords
- translation
- machine
- machine translation
- sequence
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
Abstract
The invention belongs to the technical field of machine translation and provides an automatic machine translation post-verification method based on translation error correction. The method comprises the following steps: (1) using fine-grained machine-translation post-verification data as training data, training a neural network model that maps a dual input to an edit sequence, to obtain a dual-to-edit-sequence neural network conversion model; (2) based on the conversion model obtained in step (1), taking the translation original text S and the machine translation text MT as inputs, and outputting the machine-translation error sequence E corresponding to the machine translation text; (3) converting the machine translation MT into a target translation according to the error sequence E. By introducing fine-grained translation error types as additional information, the invention on the one hand reduces the sparsity of the input-side training data; on the other hand, the output error sequence is shorter than full text and more densely distributed in space, thereby relieving the model's dependence on large-scale training data.
Description
Technical Field
The invention belongs to the technical field of machine translation, and particularly relates to a machine translation automatic post-verification method based on translation error correction.
Background
Automatic post-verification of machine translation (Machine Translation Automatic Post Editing, MTAPE for short) has received growing interest in the research field in recent years. In general, an automatic machine-translation post-verification system takes the output of a machine translation system as input, automatically corrects errors in the translation, and outputs an accurate and fluent translation that better matches human writing habits. An existing MTAPE system takes triples of (original text, machine translation, post-verification translation) as training data, updates the parameters of a neural network through sentence-level end-to-end training, and obtains an MTAPE model after training converges. In practical applications, the MTAPE system and the machine translation system are isolated from each other; the output of the machine translation system is optimized in a pipeline manner, finally yielding a higher-quality translation.
Although existing MTAPE systems have the advantages of a simple data form and easy training, the sentence-to-sentence training mode brings the following disadvantages:
(1) The triple training data takes sentences as its basic unit, so the training data is very sparse and the model must rely on a large amount of data for training, which is impractical in real scenarios;
(2) Automatic machine-translation post-verification usually does not involve continuous large-range modification of the translation, but rather several small-range adjustments; the existing sentence-level training mode therefore wastes a large number of training signals on meaningless translation copying and increases the learning difficulty of the model;
(3) The existing end-to-end verification mode is a black box whose internal judgment basis humans cannot understand, so when the automatic machine-translation post-verification system makes an error, no targeted correction scheme can be designed to improve system accuracy.
Disclosure of Invention
The invention aims to provide a machine translation automatic post-verification method based on translation error correction, which aims to solve the technical problems existing in the prior art.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
A machine translation automatic post-verification method based on translation error correction, comprising:
(1) Using fine-grained machine-translation post-verification data as training data, training a neural network model that maps a dual input to an edit sequence, to obtain a dual-to-edit-sequence neural network conversion model;
(2) Based on the conversion model obtained in step (1), taking the translation original text S and the machine translation text MT as inputs, and outputting the machine-translation error sequence E corresponding to the machine translation text;
(3) Converting the machine translation MT into a target translation according to the error sequence E;
the fine-grained machine-translation post-verification data is obtained by manually labeling phrase-level translation errors on machine translation results.
In one embodiment, the fine-grained machine post-translation verification data is a machine-translation error sequence E = {e_1, e_2, …, e_|E|} obtained by manual annotation over the three sequences S, MT and PE, where e_n denotes the n-th machine-translation error and |E| denotes the length of the error sequence;
S denotes the translation original text, MT denotes the machine translation, and PE denotes the post-verification translation; |S|, |MT| and |PE| denote the lengths of the three sequences respectively.
In one embodiment, each item of the machine-translation error sequence E is represented by a quadruple: (machine-translated phrase sequence, corrected sequence, corresponding original-text sequence, error type); the error types include mistranslation, missed translation, over-translation, professional-vocabulary error, grammar error and/or fluency error.
In one embodiment, the dual-to-edit-sequence neural network model of step (1) comprises an original-text encoder Enc_S, a machine-translation encoder Enc_MT and a decoder Dec; the encoders Enc_S and Enc_MT are used to encode the translation original text S and the machine translation text MT respectively, and the decoder Dec takes the output hidden states of Enc_S and Enc_MT as input to generate the machine-translation error sequence E in an autoregressive manner.
In one embodiment, the encoders Enc_S and Enc_MT are used to encode the translation original text S and the machine translation text MT respectively, converting them into corresponding hidden-state vectors: H_S = Enc_S(S), H_MT = Enc_MT(MT);
where Enc_S consists of m neural-network encoding layers and Enc_MT consists of n neural-network encoding layers.
In one implementation, the decoder Dec predicts the machine-translation error e_t at the current time t from the original-text hidden states H_S, the machine-translation hidden states H_MT and the already-generated error sequence e_<t: P(e_t | e_<t, S, MT) = softmax(W · Dec(e_<t, H_S, H_MT));
where W is the word-vector matrix of the decoder output layer, and Dec is formed by stacking k decoding layers and one output layer.
In one embodiment, each decoding layer comprises three sub-networks: a context representation network, an original-text attention network and a machine-translation mixed attention network;
the context representation network updates the current state according to the already-generated vectors;
the original-text attention network uses a multi-head attention mechanism to update the hidden-state vector at the current time: h'_t = MultiHead(h_t, H_S, H_S), where H_S denotes the original-text hidden states and h_t denotes the hidden-state vector at the current time;
the machine-translation mixed attention network takes the vector h'_t updated by the original-text attention network and the machine-translation hidden states H_MT as input, and outputs the updated state vector: h''_t = λ · MultiHead(h'_t, H_MT, H_MT) + (1 − λ) · h'_t;
where λ is a mixing coefficient measuring the importance of the machine translation to the decoding model, computed as λ = λ0 · σ((W · h'_t + b) / √d);
where σ denotes the sigmoid activation function, W and b are model parameters, λ0 is the initial mixing coefficient, and d denotes the dimension of the input state vector h'_t.
In one embodiment, step (1) further comprises: given an instance input (S, MT, E), constructing an optimization loss function for model training: L = − Σ_{t=1}^{|E|} log P(e_t | e_<t, H_S, H_MT);
where e_0 is the start symbol, |E| is the length of the translation error sequence E, H_S is the original-text hidden-state vector, H_MT is the machine-translation hidden-state vector, and e_<t is the already-generated machine-translation error sequence.
In one embodiment, the dual-to-edit-sequence neural network conversion model of step (1) is formulated as: θ* = argmax_θ Σ_{(S,MT,E)∈D} log P(E | S, MT; θ);
where D denotes the set of all training-set instances, which employs the fine-grained machine post-translation verification data; S denotes the translation original text, MT denotes the machine translation, E denotes the machine-translation error sequence, and θ denotes the model parameters.
In one embodiment, the specific method of step (3) is as follows: according to the error types and position information in the machine-translation error sequence E, and a text-editing method based on the error type, the machine translation is corrected through deleting and/or inserting text-editing actions, thereby achieving automatic translation repair.
In order to achieve the above object, the present invention further provides a machine translation automatic post-verification system, including:
The model building module is used for training a neural network model of the dual-to-edit sequence by adopting fine-granularity machine translated verification data as training data to obtain a neural network conversion model of the dual-to-edit sequence;
The machine-translation error prediction module is used for taking the translation original text S and the machine translation text MT as inputs, based on the dual-to-edit-sequence neural network conversion model, and outputting the machine-translation error sequence E corresponding to the machine translation text;
The automatic machine-translation repair module converts the machine translation MT into a target translation according to the error sequence E;
the fine-granularity machine translated verification data is obtained by manually marking translation errors of phrase levels on a machine translation result.
To achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program to be executed by a processor to implement the machine translation automatic post-verification method based on translation error correction as described above.
In order to achieve the above object, the present invention further provides a machine translation automatic post-verification device, including: a processor and a memory;
The memory is used for storing a computer program;
The processor is connected to the memory and is configured to execute the computer program stored in the memory, so that the automatic post-machine-translation checking device performs the automatic post-machine-translation checking method based on the correction of the translation error as described above.
Compared with the prior art, the invention has the following beneficial effects:
According to the invention, fine-grained translation error types are introduced as additional information, and the fine-grained error labels divide the supervision signal from sentence level into phrase level, which reduces the sparsity of the input-side training data; on the other hand, the invention uses a translation error sequence as output, which is shorter than full text and more densely distributed in space, thereby relieving the model's dependence on large-scale training data.
By adopting translation-error labeling, the model can pay more attention to the sequences that contain errors and is less disturbed by error-free sequences, which reduces the learning difficulty of the model.
The invention outputs the verified translation together with the translation errors of the corresponding segments, which helps humans understand the internal judgment basis of the machine translation; training data can also be constructed according to the corresponding error types for model optimization, thereby improving automatic machine-translation post-verification.
Drawings
Fig. 1 is a schematic diagram of the principle of embodiment 1 of the present invention.
Fig. 2 is a schematic diagram of the dual-to-edit-sequence neural network model in embodiment 1 of the present invention.
Fig. 3 is a schematic diagram of the machine translation automatic post-verification system based on translation error correction in embodiment 2 of the present invention.
Fig. 4 is a schematic structural view of the device disclosed in the embodiment 4 of the present invention.
Description of the embodiments
The present invention will be further described in detail below with reference to examples, so that those skilled in the art can understand it more clearly. It should be understood that the following specific embodiments only explain the present invention; the technical solutions provided by the present invention are not limited to those of the following embodiments, and the embodiments should not limit the protection scope of the present invention.
It should be noted that the illustrations provided in the following embodiments merely illustrate the basic concept of the present invention by way of illustration, so that only the components related to the present invention are shown in the drawings and are not drawn according to the number, shape and size of the components in actual implementation, the form, number and proportion of each component in actual implementation may be arbitrarily changed, and the layout of the components may be more complicated.
Example 1
As shown in fig. 1 and 2, this embodiment provides a machine translation automatic post-verification method based on translation error correction. Unlike existing whole-sentence verification methods, it trains the post-verification model on a fine-grained labeled corpus. Specifically, phrase-level translation-error labeling is first performed on machine translation results to obtain a fine-grained machine-translation post-verification corpus; during model training, the model predicts a machine-translation error sequence instead of directly predicting the target translation; finally, the machine translation is converted into the target translation according to the error sequence. The specific steps are as follows: 1. using fine-grained machine-translation post-verification data as training data, train the dual-to-edit-sequence neural network model to obtain a dual-to-edit-sequence neural network conversion model; 2. based on the obtained conversion model, take the translation original text S and the machine translation text MT as inputs and output the machine-translation error sequence E corresponding to the machine translation text; 3. convert the machine translation MT into the target translation according to the error sequence E.
The fine-grained machine post-translation verification data is a machine-translation error sequence E = {e_1, e_2, …, e_|E|} obtained by manual annotation over the three sequences S, MT and PE, where e_n denotes the n-th machine-translation error and |E| denotes the length of the error sequence; S denotes the translation original text, MT denotes the machine translation, and PE denotes the post-verification translation; |S|, |MT| and |PE| denote the lengths of the three sequences respectively.
In this embodiment, each item of the machine-translation error sequence E is represented by a quadruple: (machine-translated phrase sequence, corrected sequence, corresponding original-text sequence, error type).
As shown in Table one, error types include mistranslation, missed translation, over-translation, professional-vocabulary error, grammar error and/or fluency error; the phrase information may include the translation, definition and/or context of the phrase; "including" here means "including but not limited to". For example: given the translation original-text input "do you like to swim in winter?", the machine translation system outputs "Do you like to run in the windy winter", while the machine-translation post-verification target is "Do you like to swim in winter".
Table one: machine translation error types
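The quadruple representation described above can be sketched as a small data structure. The class, field and enum names below are illustrative assumptions (the patent does not name them), and the error types follow Table one:

```python
from dataclasses import dataclass
from enum import Enum

class ErrorType(Enum):
    MISTRANSLATION = "mistranslation"
    OMISSION = "omission"              # missed translation
    OVERTRANSLATION = "over-translation"
    TERMINOLOGY = "terminology"        # professional-vocabulary error
    GRAMMAR = "grammar"
    FLUENCY = "fluency"

@dataclass
class MTError:
    mt_phrase: str      # erroneous phrase in the machine translation
    correction: str     # corrected phrase (empty for pure deletions)
    src_phrase: str     # corresponding phrase in the source text
    error_type: ErrorType

# The worked example from the description: "run in the windy winter"
# should have been "swim in winter".
errors = [
    MTError("run", "swim", "swim (source phrase)", ErrorType.MISTRANSLATION),
    MTError("the windy", "", "", ErrorType.OVERTRANSLATION),
]
```

An error sequence E is then simply a list of such quadruples, one per labeled phrase-level error.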
In a further preferred scheme, when labeling the machine-translation error sequence, this embodiment adopts a left-to-right labeling scheme, i.e., errors whose starting points lie further to the left are labeled first; among different potential labeling sequences, the shortest one is preferred.
In a further preferred scheme, to avoid overlong output sequences caused by long phrases, the machine-translated phrase sequence and the corresponding original-text sequence in each error item are each further represented as a tuple (i, j), where i and j denote the start and end subscripts of the sequence respectively. For example, the phrase "run" in the error item (run, swim, mistranslation) would be replaced by its position tuple in the machine translation.
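The position-tuple replacement can be sketched as a simple span lookup over tokenized text; the function name and inclusive-index convention below are assumptions for illustration:

```python
def to_span(phrase_tokens, sentence_tokens):
    """Replace a phrase by its (start, end) token indices in the sentence,
    to avoid overlong output sequences (indices are inclusive)."""
    n = len(phrase_tokens)
    for i in range(len(sentence_tokens) - n + 1):
        if sentence_tokens[i:i + n] == phrase_tokens:
            return (i, i + n - 1)
    return None  # phrase not found in the sentence

# Worked example from this embodiment:
mt = "Do you like to run in the windy winter".split()
print(to_span(["run"], mt))            # (4, 4)
print(to_span(["the", "windy"], mt))   # (6, 7)
```

With this compression, the error item (run, swim, mistranslation) becomes ((4, 4), swim, mistranslation), independent of phrase length.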
The dual-to-edit-sequence neural network model in this embodiment comprises an original-text encoder Enc_S, a machine-translation encoder Enc_MT and a decoder Dec; Enc_S and Enc_MT encode the translation original text S and the machine translation text MT respectively, and Dec takes their output hidden states as input to generate the machine-translation error sequence E in an autoregressive manner.
Specifically, given a source-language input S and a machine-translation output MT, the encoders convert them into corresponding hidden-state vectors: H_S = Enc_S(S), H_MT = Enc_MT(MT);
where Enc_S consists of m neural-network encoding layers and Enc_MT consists of n neural-network encoding layers; various sequence encoding models such as Transformer, CNN or RNN can be used.
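The dual-encoder step can be sketched with toy encoders in NumPy. This is a minimal stand-in, not the patent's actual architecture: each "encoding layer" is a linear map plus tanh, whereas the description allows any sequence encoder (Transformer, CNN, RNN); all shapes and names are illustrative assumptions:

```python
import numpy as np

def encoder(tokens, emb, layers):
    """Toy stack of encoding layers mapping a token-id sequence to hidden
    states H (stand-in for Enc_S / Enc_MT)."""
    H = emb[tokens]                  # (seq_len, d) token embeddings
    for W in layers:                 # each layer: linear map + tanh
        H = np.tanh(H @ W)
    return H

rng = np.random.default_rng(2)
d, vocab = 8, 50
emb = rng.normal(size=(vocab, d))
enc_s_layers = [rng.normal(size=(d, d)) * 0.1 for _ in range(2)]   # m = 2 layers
enc_mt_layers = [rng.normal(size=(d, d)) * 0.1 for _ in range(3)]  # n = 3 layers

H_S = encoder([1, 4, 9], emb, enc_s_layers)       # H_S = Enc_S(S)
H_MT = encoder([2, 5, 7, 3], emb, enc_mt_layers)  # H_MT = Enc_MT(MT)
print(H_S.shape, H_MT.shape)  # (3, 8) (4, 8)
```

The two encoders share nothing but the embedding table here; in practice they may be entirely separate, as the description implies.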
The decoder Dec takes the original-text hidden states H_S, the machine-translation hidden states H_MT and the already-generated error sequence e_<t as input, and predicts the machine-translation error e_t at the current time t: P(e_t | e_<t, S, MT) = softmax(W · Dec(e_<t, H_S, H_MT));
where W is the word-vector matrix of the decoder output layer, and Dec is formed by stacking k decoding layers and one output layer. Each decoding layer comprises three sub-networks: a context representation network, an original-text attention network and a machine-translation mixed attention network, which work as follows:
(1) The context representation network updates the current state according to the already-generated vectors;
(2) The original-text attention network uses a multi-head attention mechanism (Multi-Head Attention) to update the hidden-state vector at the current time: h'_t = MultiHead(h_t, H_S, H_S), where H_S denotes the original-text hidden states and h_t denotes the hidden-state vector at the current time;
(3) The machine-translation mixed attention network takes the vector h'_t updated by the original-text attention network and the machine-translation hidden states H_MT as input, and outputs the updated state vector: h''_t = λ · MultiHead(h'_t, H_MT, H_MT) + (1 − λ) · h'_t;
where λ is a mixing coefficient measuring the importance of the machine translation to the decoding model, which can relieve the translation-error problem introduced by the machine translation system. λ is computed as λ = λ0 · σ((W · h'_t + b) / √d), where σ denotes the sigmoid activation function, W and b are model parameters, λ0 is the initial mixing coefficient, and d denotes the dimension of the input state vector h'_t.
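The gated mixing of decoder state and machine-translation context can be sketched as follows. This is a simplified single-head attention standing in for the multi-head attention of the description, and the exact form of the λ gate (sigmoid scaled by λ0 and √d) is a reconstruction, since the original formula image is not recoverable:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, keys):
    """Single-head dot-product attention (stand-in for multi-head attention)."""
    scores = softmax(query @ keys.T / np.sqrt(keys.shape[-1]))
    return scores @ keys

def mixed_attention(h_t, H_mt, W, b, lam0=1.0):
    """Blend the decoder state with machine-translation context via a learned
    mixing coefficient lambda (assumed gate form)."""
    d = h_t.shape[-1]
    lam = lam0 / (1.0 + np.exp(-(W @ h_t + b) / np.sqrt(d)))  # lam0 * sigmoid
    return lam * attend(h_t, H_mt) + (1.0 - lam) * h_t

rng = np.random.default_rng(0)
d = 8
h = rng.normal(size=d)             # decoder state h'_t after source attention
H_mt = rng.normal(size=(5, d))     # machine-translation hidden states H_MT
out = mixed_attention(h, H_mt, W=rng.normal(size=d), b=0.0)
print(out.shape)  # (8,)
```

When λ is driven toward 0 the decoder ignores the (possibly erroneous) machine translation and falls back on its own state, which is the motivation the description gives for the gate.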
In summary, given the instance inputs (S, MT, E), an optimization loss function for model training is constructed: L = − Σ_{t=1}^{|E|} log P(e_t | e_<t, H_S, H_MT);
where e_0 is the start symbol, |E| is the length of the translation error sequence E, H_S is the original-text hidden-state vector, H_MT is the machine-translation hidden-state vector, and e_<t is the already-generated machine-translation error sequence.
The dual-to-edit-sequence neural network model is trained in small batches according to the constructed optimization loss function, using an optimizer based on stochastic gradient descent (such as Adam or SGD).
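The per-sequence negative log-likelihood underlying this training objective can be sketched directly. The function name is an assumption; it computes the teacher-forced loss over one gold edit sequence given the decoder's per-step logits:

```python
import numpy as np

def sequence_nll(logits, target_ids):
    """Mean negative log-likelihood of the gold edit sequence under the
    decoder's per-step output distributions (teacher forcing)."""
    loss = 0.0
    for t, gold in enumerate(target_ids):
        z = logits[t] - logits[t].max()            # stable log-softmax
        log_probs = z - np.log(np.exp(z).sum())
        loss -= log_probs[gold]
    return loss / len(target_ids)

rng = np.random.default_rng(1)
logits = rng.normal(size=(4, 10))   # 4 decoding steps, vocabulary of 10 edit symbols
target = [3, 7, 0, 2]               # gold edit-symbol ids e_1..e_4
print(sequence_nll(logits, target) > 0)  # True
```

Minimizing this quantity over all training instances is exactly the argmax-log-likelihood formulation of the conversion model given below; an SGD-family optimizer (Adam, SGD) would update the parameters producing `logits`.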
After model training, the dual-to-edit-sequence neural network conversion model is formulated as: θ* = argmax_θ Σ_{(S,MT,E)∈D} log P(E | S, MT; θ);
where D denotes the set of all training-set instances, which employs the fine-grained machine post-translation verification data (whereby training data may be constructed for model optimization based on the corresponding error types); S denotes the translation original text, MT denotes the machine translation, E denotes the machine-translation error sequence, and θ denotes the model parameters.
After model training, the translation original text S and the machine translation text MT are taken as inputs, and the machine-translation error sequence E corresponding to the machine translation text is output (each item of E is represented by a quadruple: machine-translated phrase sequence, corrected sequence, corresponding original-text sequence, error type). Finally, according to the error types and position information in E and the text-editing method based on the error type, the machine translation is corrected through deleting and/or inserting text-editing actions, thereby achieving automatic translation repair.
As shown in Table two, the text-editing method based on the error type is as follows: if the error type is mistranslation, over-translation, grammar error, professional-vocabulary error or fluency error, the wrong translation is deleted and the correct translation is inserted; if the error type is missed translation, only the correct translation is inserted.
Table two: error types and corresponding editing operations
It should be noted that in the actual verification process (on real data), an error often needs to be modified multiple times; in this case, the "delete" and "insert" operations need to be invoked multiple times to correct the translation.
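The delete/insert repair step can be sketched as follows. The edit format (inclusive token spans plus error type) and right-to-left application order are illustrative assumptions; only the delete-then-insert vs. insert-only distinction comes from Table two:

```python
def apply_edits(mt_tokens, edits):
    """Repair the machine translation with delete/insert edits.
    Each edit: (start, end, replacement_tokens, error_type); spans are
    inclusive token indices. Edits are applied right-to-left so that
    earlier spans keep their original indices."""
    out = list(mt_tokens)
    for start, end, repl, etype in sorted(edits, key=lambda e: e[0], reverse=True):
        if etype == "omission":
            out[start:start] = repl        # missed translation: insert only
        else:
            out[start:end + 1] = repl      # delete wrong span, insert fix
    return out

# Worked example from this embodiment:
mt = "Do you like to run in the windy winter".split()
edits = [(4, 4, ["swim"], "mistranslation"),
         (6, 7, [], "over-translation")]
print(" ".join(apply_edits(mt, edits)))  # Do you like to swim in winter
```

An error requiring several modifications would simply contribute several (span, replacement) edits, matching the note above that "delete" and "insert" may be invoked multiple times.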
Example 2
As shown in fig. 3, the present embodiment provides a machine translation automatic post-verification system, which includes:
The model building module is used for training a neural network model of the dual-to-edit sequence by adopting fine-granularity machine translated verification data as training data to obtain a neural network conversion model of the dual-to-edit sequence; the fine-granularity machine translated verification data is obtained by manually marking translation errors of phrase levels on a machine translation result;
The machine-translation error prediction module takes the translation original text S and the machine translation text MT as inputs, based on the dual-to-edit-sequence neural network conversion model, and outputs the machine-translation error sequence E corresponding to the machine translation text;
The automatic machine-translation repair module converts the machine translation MT into the target translation according to the error sequence E, where each item of E is represented by a quadruple: (machine-translated phrase sequence, corrected sequence, corresponding original-text sequence, error type).
It should be noted that the structure and/or principle of each module corresponds to the content of the automatic post-verification method for machine translation based on translation error correction described in embodiment 1, and thus will not be described herein.
It should be understood that the division of the above system into modules is merely a division of logical functions; in actual implementation the modules may be fully or partially integrated into one physical entity, or physically separated. The modules may all be implemented as software invoked by a processing element, or all in hardware; some modules may also be implemented as software invoked by a processing element while others are implemented in hardware. For example, a module may be a separately established processing element, may be integrated in a chip of the apparatus, or may be stored in the memory of the apparatus in the form of program code that is invoked by a processing element of the apparatus to perform the module's functions; the implementation of the other modules is similar. In addition, all or part of the modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal-processing capability. In implementation, each step of the above method, or each of the above modules, may be completed by an integrated logic circuit of hardware in a processor element or by instructions in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more application-specific integrated circuits, one or more microprocessors, or one or more field-programmable gate arrays. For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit or another processor that can invoke the program code. For another example, the modules may be integrated together and implemented in the form of a system-on-chip.
Example 3
The present embodiment provides a computer-readable storage medium having stored thereon a computer program that is executed by a processor to implement the machine translation automatic post-verification method based on translation error correction provided in embodiment 1. Those skilled in the art will appreciate that: all or part of the steps of implementing the method provided in embodiment 1 may be implemented by hardware associated with a computer program, where the computer program may be stored in a computer readable storage medium, and when executed, the program performs steps including the method provided in embodiment 1; and the storage medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.
Example 4
As shown in fig. 4, this embodiment provides a machine translation automatic post-verification device, including: a processor and a memory; the memory is used for storing a computer program; the processor is connected to the memory, and is configured to execute the computer program stored in the memory, so that the automatic post-machine-translation verification device performs the automatic post-machine-translation verification method based on translation error correction described in embodiment 1.
Specifically, the memory includes various media capable of storing program code, such as ROM, RAM, magnetic disks, USB flash drives, memory cards, or optical disks.
Preferably, the processor may be a general-purpose processor, including a central processing unit, a network processor, and the like; it may also be a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
The foregoing is a preferred embodiment of the present invention. It should be noted that those skilled in the art may make several modifications without departing from the design principles and technical solutions of the present invention, and these modifications should also be considered as the protection scope of the present invention.
Claims (8)
1. A machine translation automatic post-verification method based on translation error correction, comprising:
(1) training a neural network model that maps a dual text to an edit sequence, using fine-grained machine-translation post-verification data as training data, to obtain a dual-to-edit-sequence neural network conversion model;
(2) based on the dual-to-edit-sequence neural network conversion model obtained in step (1), taking the translation source text S and the machine translation MT as inputs, and outputting the machine-translation error sequence E corresponding to the machine translation;
(3) converting the machine translation MT into a target translation according to the machine-translation error sequence E;
wherein the fine-grained machine-translation post-verification data are obtained by manually annotating phrase-level translation errors in machine translation results;
the fine-grained machine-translation post-verification data comprise a machine-translation error sequence E' = {e1, e2, …, e|E'|} obtained by jointly annotating the three sequences S', MT' and PE manually, where en denotes the n-th machine-translation error and |E'| denotes the length of the error sequence;
S' = {x1, x2, …, x|S'|} denotes the translation source text; MT' = {z1, z2, …, z|MT'|} denotes the machine translation; PE = {y1, y2, …, y|PE|} denotes the post-verification translation, where xi, zj and yk denote the i-th, j-th and k-th words of the source text, the machine translation and the post-verification translation respectively, and |S'|, |MT'| and |PE| denote the lengths of the three sequences S', MT' and PE respectively;
each term in the machine-translation error sequence E' is represented by a quadruple: the machine-translation phrase sequence, the corrected sequence, the corresponding source-text sequence, and the error type; the error types include mistranslation, omission, over-translation, terminology error, grammar error, and/or fluency error.
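For illustration, each annotated error in E' can be modeled as a simple record holding the four fields of the quadruple. This is a minimal sketch; the field names and the string labels for the error types are assumptions for the example, not names fixed by the claims.

```python
from dataclasses import dataclass

# Error types named in claim 1; these string labels are illustrative.
ERROR_TYPES = {"mistranslation", "omission", "over-translation",
               "terminology", "grammar", "fluency"}

@dataclass
class ErrorQuadruple:
    mt_phrase: str    # phrase sequence in the machine translation
    corrected: str    # corrected (post-verification) sequence
    src_phrase: str   # corresponding sequence in the source text
    error_type: str   # one of ERROR_TYPES

    def __post_init__(self):
        if self.error_type not in ERROR_TYPES:
            raise ValueError(f"unknown error type: {self.error_type}")

# A toy error sequence E' with one annotated terminology error.
e1 = ErrorQuadruple(mt_phrase="electric brain",
                    corrected="computer",
                    src_phrase="电脑",
                    error_type="terminology")
error_sequence = [e1]
print(len(error_sequence), e1.error_type)
```

A sequence of such quadruples carries both the position information (via the phrase sequences) and the error category that step (3) later uses to repair the translation.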
2. The machine translation automatic post-verification method based on translation error correction according to claim 1, wherein the dual-to-edit-sequence neural network model in step (1) comprises a source-text encoder EncS, a machine-translation encoder EncMT, and a decoder DecE; the source-text encoder EncS and the machine-translation encoder EncMT encode the translation source text S and the machine translation MT respectively, and the decoder DecE takes the hidden states output by EncS and EncMT as inputs and generates the machine-translation error sequence E in an autoregressive manner.
3. The method according to claim 2, wherein the source-text encoder EncS and the machine-translation encoder EncMT encode the translation source text S and the machine translation MT respectively, converting them into the corresponding hidden-state vectors:
HS = EncS(S)
HMT = EncMT(MT)
where EncS is composed of m stacked neural network encoding layers and EncMT is composed of n stacked neural network encoding layers.
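The dual-encoder interface above can be sketched in pure Python with toy stand-ins for the encoding layers. This is only an interface illustration under stated assumptions: real EncS/EncMT would be Transformer-style encoders, whereas the "layer" here is a trivial mixing step chosen so the example is self-contained and runnable.

```python
import random

def make_encoder(num_layers, dim):
    """Build a toy encoder: an embedding lookup followed by `num_layers`
    simple mixing layers. A stand-in for Enc_S / Enc_MT, not a Transformer."""
    def embed(token):
        # Deterministic pseudo-embedding derived from the token string.
        r = random.Random(hash(token) % (2**32))
        return [r.uniform(-1, 1) for _ in range(dim)]

    def layer(h):
        # One "encoding layer": average each vector with the sequence mean.
        mean = [sum(col) / len(h) for col in zip(*h)]
        return [[(v + m) / 2 for v, m in zip(vec, mean)] for vec in h]

    def encoder(tokens):
        h = [embed(t) for t in tokens]
        for _ in range(num_layers):
            h = layer(h)
        return h  # hidden states: one dim-sized vector per input token

    return encoder

enc_s = make_encoder(num_layers=3, dim=8)   # m = 3 encoding layers
enc_mt = make_encoder(num_layers=2, dim=8)  # n = 2 encoding layers
HS = enc_s(["the", "cat", "sat"])           # H_S = Enc_S(S)
HMT = enc_mt(["le", "chat"])                # H_MT = Enc_MT(MT)
print(len(HS), len(HMT), len(HS[0]))
```

Note that the claim allows m and n to differ: the source text and the machine translation are in different languages and need not share encoder depth.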
4. The method according to claim 3, wherein the decoder DecE takes as input the source-text hidden state HS, the machine-translation hidden state HMT, and the already-generated machine-translation error sequence E<t, and predicts the machine-translation error Et at the current time step t:
ht = DecE(HS, HMT, E<t)
Et = Softmax(ht, W)
where W is the word-vector matrix of the decoder output layer, and DecE is formed by stacking k decoding layers and one output layer.
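The output step Et = Softmax(ht, W) can be illustrated directly: project the decoder state through the output word-vector matrix and normalise. A minimal sketch, assuming W has one row per symbol in the edit vocabulary; the vocabulary itself is hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def output_layer(h_t, W):
    """E_t = Softmax(h_t, W): score the decoder state h_t against each
    row of the output word-vector matrix W, then normalise."""
    logits = [sum(hi * wi for hi, wi in zip(h_t, row)) for row in W]
    return softmax(logits)

h_t = [0.5, -1.2, 0.3]      # decoder hidden state at step t
W = [[1.0, 0.0, 0.0],       # toy output matrix: 4 edit symbols, dim 3
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 1.0, 1.0]]
probs = output_layer(h_t, W)
print(sum(probs))  # a probability distribution over the edit vocabulary
```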
5. The method according to claim 4, wherein the decoding layer comprises three sub-networks: a context representation network, a source-text attention network, and a machine-translation mixed attention network;
the context representation network updates the current state according to the already-generated vectors;
the source-text attention network uses a multi-head attention mechanism to update the hidden-state vector at the current time step: h''t = MultiHead(h't, HS, HS), where HS denotes the source-text hidden state and h't denotes the hidden-state vector at the current time step;
the machine-translation mixed attention network takes the hidden-state vector h''t updated by the source-text attention network and the machine-translation hidden state HMT as inputs, and outputs an updated state vector h'''t:
h'''t = λ · MultiHead(h''t, HMT, HMT) + (1 − λ) · h''t
where λ is a mixing coefficient measuring the importance of the machine translation to the decoding model, computed as:
λ = λ0 · σ(h''t W / √d + b)
where σ denotes the sigmoid activation function, W and b are model parameters, λ0 is the initial mixing coefficient, and d denotes the dimension of the input state vector h''t.
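The mixing coefficient and the interpolation it gates can be sketched in a few lines. Note the exact projection inside σ is reconstructed from garbled claim text and is an assumption; the sketch also assumes the machine-translation attention result has already been computed and is passed in as `att_mt`.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mixing_coefficient(h_t, w, b, lam0):
    """λ = λ0 · σ(h''_t · w / √d + b); the form of the scalar projection
    is an assumption reconstructed from the claim's definitions."""
    d = len(h_t)
    z = sum(hi * wi for hi, wi in zip(h_t, w)) / math.sqrt(d) + b
    return lam0 * sigmoid(z)

def mixed_attention_output(att_mt, h_t, lam):
    """h'''_t = λ · MultiHead(h''_t, H_MT, H_MT) + (1 − λ) · h''_t,
    with the attention result att_mt precomputed."""
    return [lam * a + (1 - lam) * h for a, h in zip(att_mt, h_t)]

h = [0.2, -0.4, 0.6, 0.1]                    # h''_t after source attention
lam = mixing_coefficient(h, w=[0.5] * 4, b=0.0, lam0=1.0)
out = mixed_attention_output([1.0] * 4, h, lam)
print(round(lam, 4), len(out))
```

Because σ is bounded in (0, 1), λ stays in (0, λ0], so the decoder can down-weight the machine translation when it is unreliable without ever discarding the source-side state.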
6. The method according to claim 5, wherein step (1) further comprises: given a training instance (S, MT, E), constructing the optimization loss function for model training:
L = − Σt=1..|E| log P(Et | E<t, HS, HMT)
where E0 is the initial start symbol, |E| is the length of the machine-translation error sequence E, HS is the source-text hidden-state vector, HMT is the machine-translation hidden-state vector, and E<t is the already-generated machine-translation error sequence.
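The autoregressive negative log-likelihood above reduces to a sum of per-step log terms once the model's probability for each gold symbol is known. A minimal sketch, with toy per-step probabilities standing in for P(Et | E<t, HS, HMT):

```python
import math

def autoregressive_nll(step_probs):
    """L = −Σ_t log P(E_t | E_<t, H_S, H_MT), where step_probs[t] is the
    probability the model assigns to the gold error symbol at step t."""
    return -sum(math.log(p) for p in step_probs)

# Toy probabilities for a 3-step gold error sequence.
loss = autoregressive_nll([0.9, 0.7, 0.8])
print(round(loss, 4))  # → 0.6852
```

Minimising this loss over the training set is exactly the per-instance objective that the claim-7 formulation sums over all of D.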
7. The method according to claim 6, wherein the dual-to-edit-sequence neural network conversion model in step (1) is formulated as:
θ* = argmaxθ Σ(S,MT,E)∈D log P(E | S, MT; θ)
where D denotes the set formed by all training instances, the training set adopts the fine-grained machine-translation post-verification data, S denotes the translation source text, MT denotes the machine translation, E denotes the machine-translation error sequence, and θ denotes the model parameters.
8. The method according to claim 7, wherein the specific procedure of step (3) is: according to the error types and position information provided in the machine-translation error sequence E, and using an error-type-based text-editing method, correcting the machine translation through deletion and/or insertion text-editing actions, thereby achieving automatic translation repair.
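The repair step in claim 8 amounts to replaying edit actions against the machine translation. A minimal sketch, assuming a hypothetical edit format of (position, action, tokens) tuples derived from the error sequence; applying edits back-to-front keeps earlier positions valid.

```python
def apply_edits(mt_tokens, edits):
    """Repair a machine translation by applying delete/insert actions.
    Each edit is (position, action, tokens); the edit format is an
    assumption for this sketch, not one fixed by the claims.
    Edits are applied from the end of the sentence backwards so that
    positions of not-yet-applied edits are not shifted."""
    out = list(mt_tokens)
    for pos, action, tokens in sorted(edits, key=lambda e: e[0], reverse=True):
        if action == "delete":
            del out[pos:pos + len(tokens)]
        elif action == "insert":
            out[pos:pos] = tokens
        else:
            raise ValueError(f"unknown action: {action}")
    return out

mt = ["he", "go", "to", "school"]
# Correct "go" to "goes": delete at position 1, then insert at position 1.
edits = [(1, "delete", ["go"]), (1, "insert", ["goes"])]
print(apply_edits(mt, edits))  # → ['he', 'goes', 'to', 'school']
```

A substitution is expressed as a delete plus an insert at the same position, so the two primitive actions named in the claim suffice to realise all six error types.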
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311559241.5A CN117273027B (en) | 2023-11-22 | 2023-11-22 | Automatic machine translation post-verification method based on translation error correction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117273027A (en) | 2023-12-22
CN117273027B (en) | 2024-04-30
Family
ID=89220006
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311559241.5A Active CN117273027B (en) | 2023-11-22 | 2023-11-22 | Automatic machine translation post-verification method based on translation error correction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117273027B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110069790A (en) * | 2019-05-10 | 2019-07-30 | 东北大学 | It is a kind of by translation retroversion to machine translation system and method literally |
CN111597778A (en) * | 2020-04-15 | 2020-08-28 | 哈尔滨工业大学 | Method and system for automatically optimizing machine translation based on self-supervision |
CN112287696A (en) * | 2020-10-29 | 2021-01-29 | 语联网(武汉)信息技术有限公司 | Post-translation editing method and device, electronic equipment and storage medium |
CN112347795A (en) * | 2020-10-04 | 2021-02-09 | 北京交通大学 | Machine translation quality evaluation method, device, equipment and medium |
CN113408307A (en) * | 2021-07-14 | 2021-09-17 | 北京理工大学 | Neural machine translation method based on translation template |
CN115345181A (en) * | 2022-07-04 | 2022-11-15 | 中国科学院自动化研究所 | Training method of neural machine translation model, translation method and device |
CN116502653A (en) * | 2022-01-18 | 2023-07-28 | 华为技术有限公司 | Translation quality evaluation method, electronic device, chip and readable storage medium |
CN116822534A (en) * | 2023-07-06 | 2023-09-29 | 四川语言桥信息技术有限公司 | Fine granularity characteristic-based machine turning evaluation index interpretation method, interpreter model and computer readable storage medium |
CN116956946A (en) * | 2023-07-14 | 2023-10-27 | 上海一者信息科技有限公司 | Machine translation text fine granularity error type identification and positioning method |
CN116992892A (en) * | 2023-07-06 | 2023-11-03 | 四川语言桥信息技术有限公司 | Method, system and readable storage medium for improving APE model based on data enhancement and multitasking training |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10635863B2 (en) * | 2017-10-30 | 2020-04-28 | Sdl Inc. | Fragment recall and adaptive automated translation |
Non-Patent Citations (2)
Title |
---|
On compositional generalization of neural machine translation; Yafu Li et al.; Computation and Language; 1-14 *
Research on Post-Editing of News Texts in the Context of Neural Machine Translation Technology; Zhou Chun; 《品位经典》 (Issue 4); 29-30 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110069790B (en) | Machine translation system and method for contrasting original text through translated text retranslation | |
CN109446534B (en) | Machine translation method and device | |
CN108647207B (en) | Natural language correction method, system, device and storage medium | |
CN108170686B (en) | Text translation method and device | |
CN113811946A (en) | End-to-end automatic speech recognition of digital sequences | |
CN109522403A (en) | A kind of summary texts generation method based on fusion coding | |
CN111177348B (en) | Training method and device for problem generation model, electronic equipment and storage medium | |
CN108132932B (en) | Neural machine translation method with replication mechanism | |
WO2021127817A1 (en) | Speech synthesis method, device, and apparatus for multilingual text, and storage medium | |
CN111144140B (en) | Zhongtai bilingual corpus generation method and device based on zero-order learning | |
CN110717345B (en) | Translation realignment recurrent neural network cross-language machine translation method | |
CN110807335A (en) | Translation method, device, equipment and storage medium based on machine learning | |
CN112507695A (en) | Text error correction model establishing method, device, medium and electronic equipment | |
CN114925170B (en) | Text proofreading model training method and device and computing equipment | |
US11615247B1 (en) | Labeling method and apparatus for named entity recognition of legal instrument | |
CN115293138A (en) | Text error correction method and computer equipment | |
CN112446221B (en) | Translation evaluation method, device, system and computer storage medium | |
CN116129902A (en) | Cross-modal alignment-based voice translation method and system | |
CN115906815A (en) | Error correction method and device for modifying one or more types of wrong sentences | |
CN116822464A (en) | Text error correction method, system, equipment and storage medium | |
WO2022141844A1 (en) | Text error correction method, apparatus and device, and readable storage medium | |
CN117273027B (en) | Automatic machine translation post-verification method based on translation error correction | |
CN114707523B (en) | Image-multilingual subtitle conversion method based on interactive converter | |
CN115270771A (en) | Fine-grained self-adaptive Chinese spelling error correction method assisted by word-sound prediction task | |
CN111666774B (en) | Machine translation method and device based on document context |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||