CN109635269B - Post-translation editing method and device for machine translation text - Google Patents
- Publication number
- CN109635269B (application CN201910079518.1A)
- Authority
- CN
- China
- Prior art keywords
- text
- translated
- vector
- machine
- translation
- Prior art date
- Legal status: Active
Classifications
- G06F40/166 Editing, e.g. inserting or deleting (G Physics; G06 Computing; G06F Electric digital data processing; G06F40/00 Handling natural language data; G06F40/10 Text processing)
- G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation (G06F40/40 Processing or translation of natural language)
- Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management (Y02D Climate change mitigation technologies in information and communication technologies)
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a post-translation editing method for machine-translated text, comprising the following steps: acquiring a source text and its machine-translated text; extracting first text features of the source text through a self-attention mechanism, and processing the first text features with a feedforward neural network to obtain a first vector representing the source text; extracting second text features of the machine-translated text through a self-attention mechanism, and optimizing the second text features by applying an attention mechanism over the first vector; processing the optimized second text features with a feedforward neural network to obtain a second vector representing the machine-translated text; and generating the translated edited text of the machine-translated text word by word, from left to right, according to the first vector and the second vector. The method improves the processing efficiency and the accuracy of post-translation editing, so that the translated edited text it produces is more accurate. The post-translation editing device, equipment and readable storage medium for machine-translated text disclosed herein have the same technical effects.
Description
Technical Field
The present invention relates to the field of automatic text translation technology, and more particularly, to a method, apparatus, device and readable storage medium for post-translation editing of machine translated text.
Background
Machine translation, also known as automatic translation, is the process of using a computer to convert one natural source language into another natural target language, and generally refers to the translation of sentences and texts between natural languages. Correspondingly, a machine-translated text is the text obtained when a computer translates text from one language into another. Post-translation editing is the process of refining machine-generated translated text so that it better conforms to human language style.
In the prior art, automatic post-translation editing is generally implemented with a recurrent neural network. However, the language-text features extracted by a recurrent neural network are not fine-grained enough, and processing the source text and the machine-translated text with a log-linear combination fails to associate the features of the two texts. The ability to characterize the source text and the machine-translated text is therefore insufficient, which reduces the accuracy of post-translation editing and, in turn, the accuracy of the translated edited text it produces. The translated edited text is the text obtained after the machine-translated text undergoes post-translation editing.
Therefore, how to improve the accuracy of post-translation editing is a problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a method, a device and equipment for post-translation editing of machine translation text and a readable storage medium, so as to improve the accuracy of post-translation editing.
In order to achieve the above purpose, the embodiment of the present invention provides the following technical solutions:
a post-translation editing method of machine-translated text, comprising:
acquiring a source text and a machine translation text of the source text;
extracting first text features of the source text through a self-attention mechanism, and processing the first text features by utilizing a feedforward neural network to obtain a first vector representing the source text;
extracting a second text feature of the machine translated text by a self-attention mechanism, the second text feature being optimized by using an attention mechanism on the first vector; processing the optimized second text feature by using a feedforward neural network to obtain a second vector representing the machine translation text;
and generating translated edited text of the machine translation text word by word from left to right according to the first vector and the second vector.
Wherein extracting the first text feature of the source text through a self-attention mechanism and processing the first text feature with a feedforward neural network to obtain a first vector representing the source text comprises:
processing the source text through a residual neural network to obtain the first vector;
wherein each network layer in the residual neural network consists of a self-attention mechanism sublayer and a feed-forward neural network sublayer.
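For illustration, the following is a minimal Python sketch of how one such residual network layer could compose the two sublayers. The layer normalization and the helper callables self_attention and feed_forward are assumptions made for readability, not details taken from the patent.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalization over the feature dimension (an assumption; the patent
    # text only names the two sublayers, not any normalization).
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def encoder_layer(x, self_attention, feed_forward):
    # One residual network layer: a self-attention sublayer followed by a
    # feed-forward sublayer, each wrapped in a residual connection.
    x = layer_norm(x + self_attention(x))
    x = layer_norm(x + feed_forward(x))
    return x
```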
Wherein said optimizing said second text feature by using an attention mechanism on said first vector comprises:
optimizing the second text feature according to an attention mechanism processing formula, wherein the attention mechanism processing formula is as follows:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V

wherein Q represents the query term in the second text feature; K and V represent the keys and values of a set of key-value pairs; and d_k is the dimension of the keys.
Wherein generating the translated edited text of the machine-translated text word by word from left to right according to the first vector and the second vector comprises:
generating the translated edited text according to a text generation formula, wherein the text generation formula is as follows:

P(y | m, x) = ∏_{t=1}^{|y|} P(y_t | y_{<t}, m, x)

wherein x represents the first vector, m represents the second vector, y represents the translated edited text, and P(y | m, x) represents the conditional probability of generating the translated edited text; the conditional probability of generating any word in the translated edited text is P(y_t | y_{<t}, m, x) = Softmax(W_o·z_t + b_o), where y_t represents the word generated at time t, W_o and b_o are generation parameters, and z_t represents the output after the network layer.
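As an illustration of this word-by-word generation rule, here is a minimal Python sketch; decoder_step, bos_id, eos_id and the greedy argmax choice are hypothetical details not specified by the patent.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def generate(decoder_step, W_o, b_o, bos_id, eos_id, max_len=100):
    # Generate the translated edited text y word by word, left to right.
    # decoder_step is a hypothetical function that consumes the words
    # generated so far (seeing the first and second vectors internally)
    # and returns z_t, the output of the top network layer at step t.
    y = [bos_id]
    for _ in range(max_len):
        z_t = decoder_step(y)
        probs = softmax(W_o @ z_t + b_o)   # P(y_t | y_<t, m, x)
        y_t = int(np.argmax(probs))        # greedy choice of the next word
        y.append(y_t)
        if y_t == eos_id:
            break
    return y
```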
Wherein, after generating the translated edited text of the machine-translated text word by word from left to right according to the first vector and the second vector, the method further comprises:
calculating a cross entropy loss function value of the translated editing text and a standard translation text of the source text;
judging whether the cross entropy loss function value is smaller than a preset threshold value or not;
if not, updating the generation parameters according to the cross entropy loss function value, and executing, with the updated generation parameters, the step of generating the translated edited text of the machine-translated text word by word from left to right according to the first vector and the second vector.
Wherein said calculating a cross entropy loss function value of the translated edited text and a standard translation text of the source text comprises:
acquiring the standard translation text, and extracting a third text feature of the standard translation text through a masked self-attention mechanism;
optimizing the third text feature by using an attention mechanism on the first vector and optimizing the third text feature a second time by using an attention mechanism on the second vector;
processing the third text feature after the second optimization by using a feedforward neural network to obtain a third vector representing the standard translation text;
and vectorizing the translated edited text into a fourth vector, and calculating a cross entropy loss function value of the fourth vector and the third vector.
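One possible reading of this final step, sketched in Python: normalizing both vectors with a softmax before taking the cross entropy is an assumption, since the patent does not state how the two vectors are turned into distributions.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def cross_entropy_loss(fourth_vec, third_vec, eps=1e-12):
    # Treat the third vector (standard translation) as the target
    # distribution and the fourth vector (translated edited text) as the
    # prediction; both are normalized with softmax first (an assumption).
    p = softmax(third_vec)
    q = softmax(fourth_vec)
    return float(-(p * np.log(q + eps)).sum())
```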
A post-translation editing device for machine-translated text, comprising:
the acquisition module is used for acquiring a source text and a machine translation text of the source text;
the first processing module is used for extracting first text features of the source text through a self-attention mechanism, and processing the first text features by utilizing a feedforward neural network to obtain a first vector representing the source text;
a second processing module for extracting a second text feature of the machine translated text by a self-attention mechanism, the second text feature being optimized by using an attention mechanism on the first vector; processing the optimized second text feature by using a feedforward neural network to obtain a second vector representing the machine translation text;
and the generation module is used for generating the translated editing text of the machine translation text word by word from left to right according to the first vector and the second vector.
Wherein the device further comprises:
the calculation module is used for calculating a cross entropy loss function value of the translated editing text and the standard translation text of the source text;
the judging module is used for judging whether the cross entropy loss function value is smaller than a preset threshold value or not;
and the execution module is used for updating the generation parameters according to the cross entropy loss function value when the cross entropy loss function value is not smaller than a preset threshold value, and executing the steps in the generation module with the updated generation parameters.
A post-translation editing device for machine-translated text, comprising:
a memory for storing a computer program;
and a processor for implementing the steps of the post-translation editing method of machine-translated text described above when executing the computer program.
A readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the post-translation editing method of machine translated text of any one of the above.
As can be seen from the above solution, the post-translation editing method for machine translation text provided by the embodiment of the present invention includes: acquiring a source text and a machine translation text of the source text; extracting first text features of the source text through a self-attention mechanism, and processing the first text features by utilizing a feedforward neural network to obtain a first vector representing the source text; extracting a second text feature of the machine translated text by a self-attention mechanism, the second text feature being optimized by using an attention mechanism on the first vector; processing the optimized second text feature by using a feedforward neural network to obtain a second vector representing the machine translation text; and generating translated edited text of the machine translation text word by word from left to right according to the first vector and the second vector.
Therefore, the method extracts the text features of the source text and the machine-translated text through a self-attention mechanism and can capture the internal structure of both texts, so that the extracted text features are more specific and fine-grained, improving the accuracy of post-translation editing; meanwhile, the second text feature of the machine-translated text is optimized by applying an attention mechanism over the first vector of the source text, which associates the features of the source text and the machine-translated text and improves the generalization ability of post-translation editing; and the feedforward neural network can combine representation information from different positions, further improving the ability to represent the information in a sentence. The method therefore improves both the processing efficiency and the accuracy of post-translation editing, so that the translated edited text obtained is more accurate.
Correspondingly, the post-translation editing device and equipment for machine-translated text and the readable storage medium have the same technical effects.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the invention; a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a method for post-translation editing of machine translated text according to an embodiment of the present invention;
FIG. 2 is a flow chart of another method for post-translational editing of machine translated text according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a post-translation editing apparatus for machine translation text according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a post-translation editing device for machine translation text according to an embodiment of the present invention;
fig. 5 is a schematic diagram of a post-translation editing network model framework according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
The embodiment of the invention discloses a method, a device and equipment for post-translation editing of a machine translation text and a readable storage medium, which are used for improving the accuracy of post-translation editing.
Referring to fig. 1, a post-translation editing method for machine translation text provided by an embodiment of the present invention includes:
s101, acquiring a source text and a machine translation text of the source text;
specifically, the machine translation text of the source text is the text obtained after the machine translation of the source text.
S102, extracting first text features of a source text through a self-attention mechanism, and processing the first text features by utilizing a feedforward neural network to obtain a first vector representing the source text;
s103, extracting second text features of the machine translation text through a self-attention mechanism, and optimizing the second text features by using the attention mechanism on the first vector; processing the optimized second text features by using a feedforward neural network to obtain a second vector representing the machine translation text;
s104, generating translated editing text of the machine translation text word by word from left to right according to the first vector and the second vector.
It should be noted that the attention mechanism mimics the internal process of biological observation behavior: it aligns internal experience with external sensation to increase the fineness of observation in a local region. Attention mechanisms can quickly extract important features from sparse data and are therefore widely used in natural language processing tasks. A self-attention mechanism can learn the dependencies between different positions within a sentence itself.
Attention mechanisms are generally used for machine translation tasks. In the present application, the attention mechanism is applied to the post-translation editing task and combined with a self-attention mechanism to capture the text features of the source text and the machine-translated text, so that specific and fine-grained text features can be extracted and the processing efficiency of post-translation editing can be improved.
It can be seen that this embodiment provides a post-translation editing method for machine-translated text. The method extracts the text features of the source text and the machine-translated text through a self-attention mechanism and can capture the internal structure of both texts, so that the extracted text features are more specific and fine-grained, improving the accuracy of post-translation editing; meanwhile, the second text feature of the machine-translated text is optimized by applying an attention mechanism over the first vector of the source text, which associates the features of the two texts and improves the generalization ability of post-translation editing; and the feedforward neural network can combine representation information from different positions, further improving the ability to represent the information in a sentence. The method therefore improves both the processing efficiency and the accuracy of post-translation editing, so that the translated edited text obtained is more accurate.
The embodiment of the invention discloses another post-translation editing method of machine translation text, and compared with the previous embodiment, the embodiment further describes and optimizes the technical scheme.
Referring to fig. 2, another method for post-translation editing of machine-translated text according to an embodiment of the present invention includes:
s201, acquiring a source text and a machine translation text of the source text;
s202, extracting first text features of a source text through a self-attention mechanism, and processing the first text features by utilizing a feedforward neural network to obtain a first vector representing the source text;
s203, extracting second text features of the machine translation text through a self-attention mechanism, and optimizing the second text features by using the attention mechanism on the first vector; processing the optimized second text features by using a feedforward neural network to obtain a second vector representing the machine translation text;
s204, generating a translated editing text of the machine translation text word by word from left to right according to the first vector and the second vector;
S205, calculating the cross entropy loss function value of the translated edited text and the standard translation text of the source text;
specifically, the standard translation text of the source text is: and after the machine translation text obtained by machine translation of the source text is compiled, obtaining the final text conforming to the human language style. The calculation of the cross entropy loss function value for the translated compiled text and the standard translated text can be understood as: and judging the similarity between the translated editing text and the standard translation text.
When the cross entropy loss function value of the translated edited text and the standard translation text is larger, the similarity between them is smaller; the two can be considered different, and the translated edited text needs further optimization and processing. When the cross entropy loss function value is smaller, the similarity between them is greater, and the two can be considered the same to a certain extent.
This embodiment considers a sentence-level loss function, which provides a better optimization basis for generating the translated edited text.
S206, judging whether the cross entropy loss function value is smaller than a preset threshold; if yes, executing S208; if not, executing S207;
s207, updating the generation parameters according to the cross entropy loss function values, and carrying out S204 with the updated generation parameters;
specifically, the loss of the translated edit text can be considered as the difference between the translated edit text and the standard translation text, and is generally expressed by the edit distance of the two texts. The smaller the edit distance of the two texts, the more similar the two texts are indicated.
S208, determining the generated translated edited text as the standard translation result of the machine-translated text.
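The S204-S208 flow can be sketched as a loop in Python; all four callables and the threshold handling below are hypothetical stand-ins for the steps above.

```python
def post_edit_training_loop(generate_fn, loss_fn, update_fn, params, threshold):
    # Regenerate the translated edited text with updated generation
    # parameters until the cross entropy loss falls below the threshold.
    while True:
        edited = generate_fn(params)       # S204: word-by-word generation
        loss = loss_fn(edited)             # S205: cross entropy vs. standard translation
        if loss < threshold:               # S206: compare with preset threshold
            return edited                  # S208: accept as the standard translation result
        params = update_fn(params, loss)   # S207: update the generation parameters
```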
Wherein said calculating a cross entropy loss function value of the translated edited text and the standard translation text of the source text comprises:
acquiring the standard translation text, and extracting a third text feature of the standard translation text through a masked self-attention mechanism;
optimizing the third text feature by using an attention mechanism on the first vector and optimizing the third text feature a second time by using an attention mechanism on the second vector;
processing the third text feature after the second optimization by using a feedforward neural network to obtain a third vector representing the standard translation text;
and vectorizing the translated edited text into a fourth vector, and calculating a cross entropy loss function value of the fourth vector and the third vector.
It can be seen that this embodiment provides another post-translation editing method for machine-translated text. The method extracts the text features of the source text and the machine-translated text through a self-attention mechanism and can capture the internal structure of both texts, so that the extracted text features are more specific and fine-grained, improving the accuracy of post-translation editing; meanwhile, the second text feature of the machine-translated text is optimized by applying an attention mechanism over the first vector of the source text, which associates the features of the two texts and improves the generalization ability of post-translation editing; and the feedforward neural network can combine representation information from different positions, further improving the ability to represent the information in a sentence. The method therefore improves both the processing efficiency and the accuracy of post-translation editing, so that the translated edited text obtained is more accurate.
Based on any of the foregoing embodiments, it should be noted that extracting the first text feature of the source text through a self-attention mechanism and processing the first text feature with a feedforward neural network to obtain a first vector representing the source text comprises:
processing the source text through a residual neural network to obtain the first vector;
wherein each network layer in the residual neural network consists of a self-attention mechanism sublayer and a feed-forward neural network sublayer.
Based on any of the above embodiments, it should be noted that optimizing the second text feature by using an attention mechanism on the first vector includes:
optimizing the second text feature according to an attention mechanism processing formula, wherein the attention mechanism processing formula is as follows:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V

wherein Q represents the query term in the second text feature; K and V represent the keys and values of a set of key-value pairs; and d_k is the dimension of the keys.
Based on any of the above embodiments, it should be noted that generating the translated edited text of the machine-translated text word by word from left to right according to the first vector and the second vector comprises:
generating the translated edited text according to a text generation formula, wherein the text generation formula is as follows:

P(y | m, x) = ∏_{t=1}^{|y|} P(y_t | y_{<t}, m, x)

wherein x represents the first vector, m represents the second vector, y represents the translated edited text, and P(y | m, x) represents the conditional probability of generating the translated edited text; the conditional probability of generating any word in the translated edited text is P(y_t | y_{<t}, m, x) = Softmax(W_o·z_t + b_o), where y_t represents the word generated at time t, W_o and b_o are generation parameters, and z_t represents the output after the network layer.
If the post-translation editing method provided by the invention is used to construct a post-translation editing processing model, this network layer is the last layer of the whole model.
The following describes a post-translation editing device for machine-translated text according to an embodiment of the present invention; the post-translation editing device described below and the post-translation editing method described above may be referred to in conjunction with each other.
Referring to fig. 3, a post-translation editing device for machine translation text provided in an embodiment of the present invention includes:
an obtaining module 301, configured to obtain a source text and a machine translation text of the source text;
a first processing module 302, configured to extract a first text feature of the source text through a self-attention mechanism, and process the first text feature by using a feedforward neural network to obtain a first vector representing the source text;
a second processing module 303 for extracting second text features of the machine translated text by a self-attention mechanism, the second text features being optimized by using an attention mechanism on the first vector; processing the optimized second text feature by using a feedforward neural network to obtain a second vector representing the machine translation text;
a generating module 304, configured to generate a translated edit text of the machine translated text word by word from left to right according to the first vector and the second vector.
Wherein the device further comprises:
the calculation module is used for calculating a cross entropy loss function value of the translated editing text and the standard translation text of the source text;
the judging module is used for judging whether the cross entropy loss function value is smaller than a preset threshold value or not;
and the execution module is used for updating the generation parameters according to the cross entropy loss function value when the cross entropy loss function value is not smaller than a preset threshold value, and executing the steps in the generation module with the updated generation parameters.
Wherein the computing module comprises:
an obtaining unit, configured to obtain the standard translation text, and extract a third text feature of the standard translation text through a masked self-attention mechanism;
a first optimizing unit configured to optimize the third text feature by using an attention mechanism for the first vector, and to optimize the third text feature for a second time by using an attention mechanism for the second vector;
the second optimizing unit is used for processing the third text characteristic after the second optimization by utilizing a feedforward neural network to obtain a third vector representing the standard translation text;
and the calculating unit is used for vectorizing the translated edited text into a fourth vector and calculating a cross entropy loss function value of the fourth vector and the third vector.
The first processing module is specifically configured to:
processing the source text through a residual neural network to obtain the first vector;
wherein each network layer in the residual neural network consists of a self-attention mechanism sublayer and a feed-forward neural network sublayer.
The second processing module is specifically configured to:
optimizing the second text feature according to an attention mechanism processing formula, wherein the attention mechanism processing formula is as follows:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V

wherein Q represents the query term in the second text feature; K and V represent the keys and values of a set of key-value pairs; and d_k is the dimension of the keys.
The generating module is specifically configured to:
generating the translated edited text according to a text generation formula, wherein the text generation formula is as follows:

P(y | m, x) = ∏_{t=1}^{|y|} P(y_t | y_{<t}, m, x)

wherein x represents the first vector, m represents the second vector, y represents the translated edited text, and P(y | m, x) represents the conditional probability of generating the translated edited text; the conditional probability of generating any word in the translated edited text is P(y_t | y_{<t}, m, x) = Softmax(W_o·z_t + b_o), where y_t represents the word generated at time t, W_o and b_o are generation parameters, and z_t represents the output after the network layer.
It can be seen that this embodiment provides a post-translation editing apparatus for machine-translated text, comprising an acquisition module, a first processing module, a second processing module and a generation module. First, the acquisition module acquires a source text and the machine-translated text of the source text. The first processing module then extracts first text features of the source text through a self-attention mechanism and processes them with a feedforward neural network to obtain a first vector representing the source text. The second processing module extracts second text features of the machine-translated text through a self-attention mechanism, optimizes them by applying an attention mechanism over the first vector, and processes the optimized second text features with a feedforward neural network to obtain a second vector representing the machine-translated text. Finally, the generation module generates the translated edited text of the machine-translated text word by word from left to right according to the first vector and the second vector. The modules thus divide the work and cooperate, improving the processing efficiency and accuracy of post-translation editing so that the resulting translated edited text is more accurate.
The following describes a post-translation editing device for machine-translated text according to an embodiment of the present invention; the device described below and the post-translation editing method and apparatus for machine-translated text described above may be referred to in conjunction with each other.
Referring to fig. 4, a post-translation editing device for machine-translating text according to an embodiment of the present invention includes:
a memory 401 for storing a computer program;
a processor 402, configured to implement the steps of the post-translation editing method of machine translation text according to any of the above embodiments when executing the computer program.
The processor may be a central processing unit (CPU) or a graphics processing unit (GPU). A GPU has clear advantages when processing large-scale data.
The following describes a readable storage medium according to an embodiment of the present invention; the readable storage medium described below and the method, apparatus and device for post-translation editing of machine-translated text described above may be referred to in conjunction with each other.
A readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of a post-translation editing method of machine translated text as described in any of the embodiments above.
The post-translation editing method provided by the invention can be used to construct the post-translation editing network model shown in fig. 5. The model comprises a source-text processing network, a machine-translated-text processing network and a standard-translation-text processing network, all three of which are residual networks.
The source-text processing network has N layers, each consisting of a self-attention sublayer and a feed-forward neural network sublayer. The machine-translated-text processing network has N layers, each consisting of a self-attention sublayer, an attention sublayer and a feed-forward neural network sublayer; the attention sublayer in the machine-translated-text processing network applies an attention mechanism over the source text. The standard-translation-text processing network has N layers, each consisting of a masked self-attention sublayer, attention sublayers and a feed-forward neural network sublayer; the attention sublayers in the standard-translation-text processing network apply an attention mechanism over the source text and an attention mechanism over the machine-translated text.
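A minimal Python sketch of one layer of the machine-translated-text processing network under these assumptions; all helper callables (self_attention, cross_attention, feed_forward, layer_norm) are hypothetical, and the residual wiring mirrors the source-text layer sketched earlier.

```python
def mt_text_layer(m, src, self_attention, cross_attention, feed_forward, layer_norm):
    # One layer of the machine-translated-text processing network:
    # self-attention over the machine translation, an attention sublayer
    # over the source-text representation, then the feed-forward sublayer,
    # each wrapped in a residual connection.
    m = layer_norm(m + self_attention(m))
    m = layer_norm(m + cross_attention(m, src, src))  # Q from m; K, V from source
    m = layer_norm(m + feed_forward(m))
    return m
```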
The attention mechanism maps a query (Query) and a set of key-value pairs (key-values): it computes the dot products of the query term Q with all keys K, scales each dot product by dividing it by √d_k, and finally applies a softmax function to obtain the weight distribution over the values V. This can be described by the following formula:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V
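A direct Python (NumPy) rendering of this scaled dot-product attention, assuming 2-D query, key and value matrices:

```python
import numpy as np

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q·K^T / sqrt(d_k))·V.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # dot products of the query with all keys, scaled
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax -> weights over the values
    return weights @ V
```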
the multi-headed attention mechanism allows the model to jointly pay attention to information from different token subspaces at different locations, which can be formulated as follows:
MultiHead(Q,K,V)=Concat(head 1 ,...,head h )W o
wherehead i =Attention(QW i Q ,KW i K ,VW i V )
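A sketch of the multi-head computation, reusing the attention function from the sketch above; the per-head projection matrices W_q, W_k, W_v and the output matrix W_o are assumed to be supplied by the caller.

```python
import numpy as np

def multi_head(Q, K, V, W_q, W_k, W_v, W_o, h):
    # Project Q, K, V into h subspaces, attend in each head, then
    # concatenate the heads and project back with W_o.
    heads = [attention(Q @ W_q[i], K @ W_k[i], V @ W_v[i]) for i in range(h)]
    return np.concatenate(heads, axis=-1) @ W_o
```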
The feed-forward neural network comprises two linear transformations with a ReLU activation function between them, which can be expressed by the following formula:

FFN(x) = max(0, x·W_1 + b_1)·W_2 + b_2

where W_1, W_2, b_1 and b_2 are all trainable parameters.
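The same formula in NumPy, assuming the weight matrices and biases are given:

```python
import numpy as np

def ffn(x, W1, b1, W2, b2):
    # FFN(x) = max(0, x·W1 + b1)·W2 + b2: two linear transformations with
    # a ReLU in between, applied independently at each position of x.
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2
```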
The Discriminator in fig. 5 is the discriminator of the post-translation editing network model. It uses a recurrent neural network, specifically a bidirectional gated recurrent unit (GRU) structure, to represent sentences. The discriminator reads in the translated edited text and the standard translation text, applies the bidirectional GRU to the word embeddings of the two sentences to obtain content vectors, and uses a loss function to discriminate the generated text from the reference text, so that its discrimination becomes increasingly accurate.
The discriminator judges the cross entropy loss of the translated edited text and the standard translation according to:

P(y, r) = sigmoid(W_d · ||H_y - H_r|| + b_d)

The loss function of the discriminator is expressed by the following formula:

L(H_y, H_r) = -log(sigmoid(W_d · ||H_y - H_r|| + b_d))

where ||H_y - H_r|| represents the Euclidean distance between the content vectors of the translated edited text and the standard translation text, and W_d and b_d are trainable parameters.
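A PyTorch sketch of such a discriminator; the embedding and hidden sizes, and the reading of W_d and b_d as a linear layer applied to the scalar distance, are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    # Bidirectional-GRU discriminator: encodes the translated edited text
    # and the standard translation into content vectors H_y and H_r, then
    # scores their Euclidean distance with a sigmoid.
    def __init__(self, vocab_size, emb_dim=256, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.W_d = nn.Linear(1, 1)  # trainable W_d and b_d acting on the distance

    def content_vector(self, ids):
        _, h = self.gru(self.embed(ids))        # h: (2, batch, hidden)
        return torch.cat([h[0], h[1]], dim=-1)  # concatenate both directions

    def forward(self, y_ids, r_ids):
        H_y = self.content_vector(y_ids)
        H_r = self.content_vector(r_ids)
        dist = torch.norm(H_y - H_r, dim=-1, keepdim=True)  # ||H_y - H_r||
        p = torch.sigmoid(self.W_d(dist))       # P(y, r)
        return p, -torch.log(p + 1e-12)         # score and loss L(H_y, H_r)
```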
When the discrimination result output by the discriminator does not meet the preset output condition, the loss of the translated edited text is calculated and fed back to optimize the network parameters of the post-translation editing network model, so that a more accurate translated edited text is generated.
Wherein the maximum expected value of the objective function for generating the translated edited text is set to:
When training the discriminator of the post-translation editing network model, the generator parameters are frozen and the loss function of the discriminator is minimized. Specifically, the generator is trained for 4 epochs, then the discriminator for one epoch, alternating in turn until both the generator and the discriminator of the model converge, at which point training stops.
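This 4-to-1 alternating schedule can be sketched as follows, with all three callables hypothetical:

```python
def adversarial_schedule(train_generator_epoch, train_discriminator_epoch, converged):
    # Alternate: 4 epochs of generator training, then 1 epoch of
    # discriminator training (generator parameters are frozen inside
    # train_discriminator_epoch), iterating until both converge.
    while not converged():
        for _ in range(4):
            train_generator_epoch()
        train_discriminator_epoch()
```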
In the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments can be referred to in conjunction with each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (9)
1. A method of post-translation editing of machine-translated text, comprising:
acquiring a source text and a machine translation text of the source text;
extracting first text features of the source text through a self-attention mechanism, and processing the first text features by utilizing a feedforward neural network to obtain a first vector representing the source text;
extracting a second text feature of the machine translated text by a self-attention mechanism, the second text feature being optimized by using an attention mechanism on the first vector; processing the optimized second text feature by using a feedforward neural network to obtain a second vector representing the machine translation text;
generating translated edited text of the machine translated text word by word from left to right according to the first vector and the second vector;
wherein generating the translated edited text of the machine-translated text word by word from left to right according to the first vector and the second vector comprises:
generating the translated edited text according to a text generation formula, wherein the text generation formula is as follows:

P(y | m, x) = ∏_{t=1}^{|y|} P(y_t | y_{<t}, m, x)

wherein x represents the first vector, m represents the second vector, y represents the translated edited text, and P(y | m, x) represents the conditional probability of generating the translated edited text; the conditional probability of generating any word in the translated edited text is P(y_t | y_{<t}, m, x) = Softmax(W_o·z_t + b_o), where y_t represents the word generated at time t, W_o and b_o are generation parameters, and z_t represents the output after the network layer.
2. The method of post-translation editing of machine-translated text according to claim 1 wherein the extracting the first text feature of the source text by a self-attention mechanism and processing the first text feature using a feed-forward neural network to obtain a first vector representing the source text comprises:
processing the source text through a residual neural network to obtain the first vector;
wherein each network layer in the residual neural network consists of a self-attention mechanism sublayer and a feed-forward neural network sublayer.
3. The method of post-translation editing of machine-translated text according to claim 2, wherein said optimizing the second text feature by using an attention mechanism on the first vector comprises:
optimizing the second text feature according to an attention mechanism processing formula, wherein the attention mechanism processing formula is as follows:

Attention(Q, K, V) = softmax(Q·K^T / √d_k)·V

wherein Q represents the query term in the second text feature; K and V represent the keys and values of a set of key-value pairs; and d_k is the dimension of the keys.
4. A method of post-translation editing of machine-translated text according to any one of claims 1-3, wherein, after said generating of the translated edited text of said machine-translated text word by word from left to right based on said first vector and said second vector, the method further comprises:
calculating a cross entropy loss function value of the translated editing text and a standard translation text of the source text;
judging whether the cross entropy loss function value is smaller than a preset threshold value or not;
if not, updating the generation parameters according to the cross entropy loss function value, and executing, with the updated generation parameters, the step of generating the translated edited text of the machine-translated text word by word from left to right according to the first vector and the second vector.
5. The method of post-translation editing of machine-translated text of claim 4, wherein said calculating a cross entropy loss function value of the translated edited text and the standard translation text of the source text comprises:
acquiring the standard translation text, and extracting a third text feature of the standard translation text through a masked self-attention mechanism;
optimizing the third text feature by using an attention mechanism on the first vector and optimizing the third text feature a second time by using an attention mechanism on the second vector;
processing the third text feature after the second optimization by using a feedforward neural network to obtain a third vector representing the standard translation text;
and vectorizing the translated edited text into a fourth vector, and calculating a cross entropy loss function value of the fourth vector and the third vector.
6. A post-translation editing device for machine-translated text, comprising:
the acquisition module is used for acquiring a source text and a machine translation text of the source text;
the first processing module is used for extracting first text features of the source text through a self-attention mechanism, and processing the first text features by utilizing a feedforward neural network to obtain a first vector representing the source text;
a second processing module for extracting a second text feature of the machine translated text by a self-attention mechanism, the second text feature being optimized by using an attention mechanism on the first vector; processing the optimized second text feature by using a feedforward neural network to obtain a second vector representing the machine translation text;
the generation module is used for generating translated editing text of the machine translation text word by word from left to right according to the first vector and the second vector;
the generating module is specifically configured to:
generating the translated edited text according to a text generation formula, wherein the text generation formula is as follows:

P(y | m, x) = ∏_{t=1}^{|y|} P(y_t | y_{<t}, m, x)

wherein x represents the first vector, m represents the second vector, y represents the translated edited text, and P(y | m, x) represents the conditional probability of generating the translated edited text; the conditional probability of generating any word in the translated edited text is P(y_t | y_{<t}, m, x) = Softmax(W_o·z_t + b_o), where y_t represents the word generated at time t, W_o and b_o are generation parameters, and z_t represents the output after the network layer.
7. The machine-translated text post-editing apparatus of claim 6, further comprising:
the calculation module is used for calculating a cross entropy loss function value of the translated editing text and the standard translation text of the source text;
the judging module is used for judging whether the cross entropy loss function value is smaller than a preset threshold value or not;
and the execution module is used for updating the generation parameters according to the cross entropy loss function value when the cross entropy loss function value is not smaller than a preset threshold value, and executing the steps in the generation module with the updated generation parameters.
8. A post-translation editing device for machine-translated text, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the method for post-translation editing of machine-translated text according to any one of claims 1 to 5 when executing said computer program.
9. A readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the steps of the method for post-translation editing of machine-translated text according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910079518.1A CN109635269B (en) | 2019-01-31 | 2019-01-31 | Post-translation editing method and device for machine translation text |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109635269A CN109635269A (en) | 2019-04-16 |
CN109635269B true CN109635269B (en) | 2023-06-16 |
Family
ID=66062387
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910079518.1A Active CN109635269B (en) | 2019-01-31 | 2019-01-31 | Post-translation editing method and device for machine translation text |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109635269B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110765791B (en) * | 2019-11-01 | 2021-04-06 | 清华大学 | Automatic post-editing method and device for machine translation |
CN110909527B (en) * | 2019-12-03 | 2023-12-08 | 北京字节跳动网络技术有限公司 | Text processing model running method and device, electronic equipment and storage medium |
CN116069901B (en) * | 2023-02-03 | 2023-08-11 | 上海一者信息科技有限公司 | Non-translated element identification method based on editing behavior and rule |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107301173A (en) * | 2017-06-22 | 2017-10-27 | 北京理工大学 | A kind of automatic post-editing system and method for multi-source neutral net that mode is remixed based on splicing |
CN107967262A (en) * | 2017-11-02 | 2018-04-27 | 内蒙古工业大学 | A kind of neutral net covers Chinese machine translation method |
CN108563640A (en) * | 2018-04-24 | 2018-09-21 | 中译语通科技股份有限公司 | A kind of multilingual pair of neural network machine interpretation method and system |
CN109241536A (en) * | 2018-09-21 | 2019-01-18 | 浙江大学 | It is a kind of based on deep learning from the sentence sort method of attention mechanism |
CN109271646A (en) * | 2018-09-04 | 2019-01-25 | 腾讯科技(深圳)有限公司 | Text interpretation method, device, readable storage medium storing program for executing and computer equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109635269A (en) | 2019-04-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |