CN109635269B - Method and device for post-translation editing of machine-translated text - Google Patents
- Publication number
- CN109635269B (application CN201910079518.1A)
- Authority
- CN
- China
- Prior art keywords: text, translated, vector, machine, post
- Prior art date: 2019-01-31
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a method for post-editing machine-translated text, comprising: obtaining a source text and a machine translation of the source text; extracting first text features of the source text through a self-attention mechanism and processing the first text features with a feed-forward neural network to obtain a first vector representing the source text; extracting second text features of the machine-translated text through a self-attention mechanism, optimizing the second text features by applying an attention mechanism over the first vector, and processing the optimized second text features with a feed-forward neural network to obtain a second vector representing the machine-translated text; and generating the post-edited text of the machine-translated text word by word, from left to right, according to the first vector and the second vector. The method improves the processing efficiency and accuracy of post-editing, so that the resulting post-edited text is more accurate. The post-editing apparatus, device and readable storage medium for machine-translated text disclosed by the invention achieve the same technical effects.
Description
Technical Field
The present invention relates to the technical field of automatic text translation, and more specifically to a post-editing method, apparatus, device and readable storage medium for machine-translated text.
Background
Machine translation, also known as automatic translation, is the process of using a computer to convert text in one natural source language into another natural target language, generally referring to the translation of sentences and full documents between natural languages. Correspondingly, a machine-translated text is the text in one language obtained by using a computer to translate a text in another language. Post-editing is the process of refining machine-generated translations so that they better conform to human language style.
In the prior art, automatic post-editing is generally implemented with recurrent neural networks. The text features extracted by a recurrent neural network are not fine-grained enough, and the log-linear combination used to process the source text and the machine-translated text cannot correlate features between the two. The resulting representations of the source text and the machine-translated text are therefore insufficient, which lowers the accuracy of post-editing and of the post-edited text it produces. Here, the post-edited text is the text obtained after post-editing the machine-translated text.
Therefore, how to improve the accuracy of post-editing is a problem to be solved by those skilled in the art.
Summary of the Invention
The purpose of the present invention is to provide a post-editing method, apparatus, device and readable storage medium for machine-translated text, so as to improve the accuracy of post-editing.
To achieve the above purpose, embodiments of the present invention provide the following technical solutions:
A post-editing method for machine-translated text, comprising:
obtaining a source text and a machine translation of the source text;
extracting first text features of the source text through a self-attention mechanism, and processing the first text features with a feed-forward neural network to obtain a first vector representing the source text;
extracting second text features of the machine-translated text through a self-attention mechanism, optimizing the second text features by applying an attention mechanism over the first vector, and processing the optimized second text features with a feed-forward neural network to obtain a second vector representing the machine-translated text;
generating the post-edited text of the machine-translated text word by word, from left to right, according to the first vector and the second vector.
Wherein extracting the first text features of the source text through a self-attention mechanism and processing the first text features with a feed-forward neural network to obtain the first vector representing the source text comprises:
processing the source text through a residual neural network to obtain the first vector;
wherein each network layer of the residual neural network consists of a self-attention sublayer and a feed-forward sublayer.
Wherein optimizing the second text features by applying an attention mechanism over the first vector comprises:
optimizing the second text features according to the attention-mechanism formula:

Attention(Q, K, V) = softmax(QK^T / √d_k) · V

where Q denotes the query items in the second text features, K and V denote a pair of keys and values, and d_k is the dimension of the keys.
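Purely as an illustrative sketch (not code from the patent), the following NumPy snippet implements this scaled dot-product attention; the sequence lengths and dimensions are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row-wise max for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)        # scaled dot products
    return softmax(scores, axis=-1) @ V    # weighted sum of the values

# Example: 4 machine-translation positions attending over 5 source positions.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # queries from the second text features
K = rng.normal(size=(5, 8))   # keys
V = rng.normal(size=(5, 8))   # values
print(attention(Q, K, V).shape)  # (4, 8)
```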
Wherein generating the post-edited text of the machine-translated text word by word from left to right according to the first vector and the second vector comprises:
generating the post-edited text according to the text generation formula:

P(y | m, x) = ∏_t P(y_t | y_<t, m, x)

where x denotes the first vector, m denotes the second vector, y denotes the post-edited text, and P(y|m,x) denotes the conditional probability of generating the post-edited text. The conditional probability of generating any single word of the post-edited text is P(y_t | y_<t, m, x) = Softmax(W_o·z_t + b_o), where y_t denotes the word generated at time t, W_o and b_o are generation parameters, and z_t denotes the output after passing through the network layer.
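As a hedged sketch of this output layer, the snippet below computes Softmax(W_o·z_t + b_o) for one time step and picks the most probable word greedily; the vocabulary size, dimensions and greedy choice are assumptions for the example.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

d_model, vocab_size = 8, 100                   # assumed sizes
rng = np.random.default_rng(1)
W_o = rng.normal(size=(vocab_size, d_model))   # generation parameter W_o
b_o = np.zeros(vocab_size)                     # generation parameter b_o

z_t = rng.normal(size=d_model)     # output of the last network layer at time t
p_t = softmax(W_o @ z_t + b_o)     # P(y_t | y_<t, m, x) over the vocabulary
y_t = int(p_t.argmax())            # greedy choice of the word at time t
print(y_t, float(p_t[y_t]))
```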
Wherein, after generating the post-edited text of the machine-translated text word by word from left to right according to the first vector and the second vector, the method further comprises:
calculating the value of the cross-entropy loss function between the post-edited text and the standard translation of the source text;
judging whether the cross-entropy loss value is smaller than a preset threshold;
if not, updating the generation parameters according to the cross-entropy loss value, and executing again, with the updated generation parameters, the step of generating the post-edited text of the machine-translated text word by word from left to right according to the first vector and the second vector.
Wherein calculating the value of the cross-entropy loss function between the post-edited text and the standard translation of the source text comprises:
obtaining the standard translation, and extracting third text features of the standard translation through a masked self-attention mechanism;
optimizing the third text features by applying an attention mechanism over the first vector, and optimizing the third text features a second time by applying an attention mechanism over the second vector;
processing the twice-optimized third text features with a feed-forward neural network to obtain a third vector representing the standard translation;
vectorizing the post-edited text into a fourth vector, and calculating the cross-entropy loss value between the fourth vector and the third vector.
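Purely for illustration, a minimal NumPy sketch of a cross-entropy comparison between predicted word distributions and reference word ids follows; the patent does not fix these shapes or names, so they are assumptions.

```python
import numpy as np

def cross_entropy(pred_probs, target_ids):
    # pred_probs: (T, vocab) distributions predicted for the post-edited text;
    # target_ids: (T,) word ids of the standard (reference) translation.
    eps = 1e-12  # guard against log(0)
    picked = pred_probs[np.arange(len(target_ids)), target_ids]
    return float(-np.mean(np.log(picked + eps)))

# Toy example: 3 time steps over a 5-word vocabulary.
pred = np.full((3, 5), 0.1)
pred[[0, 1, 2], [2, 0, 4]] = 0.6   # put most mass on the reference words
target = np.array([2, 0, 4])
loss = cross_entropy(pred, target)
print(loss, loss < 0.6)            # compare against a preset threshold
```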
A post-editing apparatus for machine-translated text, comprising:
an obtaining module, configured to obtain a source text and a machine translation of the source text;
a first processing module, configured to extract first text features of the source text through a self-attention mechanism and process the first text features with a feed-forward neural network to obtain a first vector representing the source text;
a second processing module, configured to extract second text features of the machine-translated text through a self-attention mechanism, optimize the second text features by applying an attention mechanism over the first vector, and process the optimized second text features with a feed-forward neural network to obtain a second vector representing the machine-translated text;
a generation module, configured to generate the post-edited text of the machine-translated text word by word, from left to right, according to the first vector and the second vector.
The apparatus further comprises:
a calculation module, configured to calculate the value of the cross-entropy loss function between the post-edited text and the standard translation of the source text;
a judging module, configured to judge whether the cross-entropy loss value is smaller than a preset threshold;
an execution module, configured to, when the cross-entropy loss value is not smaller than the preset threshold, update the generation parameters according to the cross-entropy loss value and execute the steps of the generation module with the updated generation parameters.
A post-editing device for machine-translated text, comprising:
a memory for storing a computer program;
a processor, configured to implement the steps of the post-editing method for machine-translated text described in any one of the above when executing the computer program.
A readable storage medium, storing a computer program which, when executed by a processor, implements the steps of the post-editing method for machine-translated text described in any one of the above.
As can be seen from the above solutions, the post-editing method for machine-translated text provided by embodiments of the present invention comprises: obtaining a source text and a machine translation of the source text; extracting first text features of the source text through a self-attention mechanism and processing them with a feed-forward neural network to obtain a first vector representing the source text; extracting second text features of the machine-translated text through a self-attention mechanism, optimizing the second text features by applying an attention mechanism over the first vector, and processing the optimized second text features with a feed-forward neural network to obtain a second vector representing the machine-translated text; and generating the post-edited text word by word, from left to right, according to the first vector and the second vector.
The method extracts text features of the source text and the machine-translated text through self-attention, capturing the internal structure of both texts, so that the extracted features are more specific and fine-grained, which improves the accuracy of post-editing. Meanwhile, optimizing the second text features of the machine translation with an attention mechanism over the first vector of the source text correlates features between the two texts and improves the generalization ability of post-editing. The feed-forward neural network combines representation information from different positions, further improving the capability to represent sentence information. The method therefore improves the processing efficiency and accuracy of post-editing, so that the resulting post-edited text is more accurate.
Correspondingly, the post-editing apparatus, device and readable storage medium for machine-translated text provided by embodiments of the present invention achieve the same technical effects.
Brief Description of the Drawings
In order to illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a post-editing method for machine-translated text disclosed in an embodiment of the present invention;
Fig. 2 is a flowchart of another post-editing method for machine-translated text disclosed in an embodiment of the present invention;
Fig. 3 is a schematic diagram of a post-editing apparatus for machine-translated text disclosed in an embodiment of the present invention;
Fig. 4 is a schematic diagram of a post-editing device for machine-translated text disclosed in an embodiment of the present invention;
Fig. 5 is a schematic diagram of a post-editing network model framework disclosed in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
Embodiments of the present invention disclose a post-editing method, apparatus, device and readable storage medium for machine-translated text, so as to improve the accuracy of post-editing.
Referring to Fig. 1, a post-editing method for machine-translated text provided by an embodiment of the present invention comprises:
S101: obtaining a source text and a machine translation of the source text;
Specifically, the machine translation of the source text is the text obtained after machine-translating the source text.
S102: extracting first text features of the source text through a self-attention mechanism, and processing the first text features with a feed-forward neural network to obtain a first vector representing the source text;
S103: extracting second text features of the machine-translated text through a self-attention mechanism, optimizing the second text features by applying an attention mechanism over the first vector, and processing the optimized second text features with a feed-forward neural network to obtain a second vector representing the machine-translated text;
S104: generating the post-edited text of the machine-translated text word by word, from left to right, according to the first vector and the second vector.
It should be noted that the attention mechanism imitates the internal process of biological observation behavior, i.e., a mechanism that aligns internal experience with external perception so as to increase the fineness of observation in certain regions. The attention mechanism can quickly extract important features from sparse data and is therefore widely used in natural language processing tasks. The self-attention mechanism can learn dependencies between different positions within a sentence itself.
The attention mechanism is generally used for machine translation tasks; in the present application, it is applied to post-editing and combined with self-attention to capture text features of the source text and the machine-translated text, which not only extracts specific and fine-grained text features but also improves the processing efficiency of post-editing.
As can be seen, this embodiment provides a post-editing method for machine-translated text. The method extracts text features of the source text and the machine-translated text through self-attention, capturing the internal structure of both texts, so that the extracted features are more specific and fine-grained, which improves the accuracy of post-editing. Meanwhile, optimizing the second text features of the machine translation with an attention mechanism over the first vector of the source text correlates features between the two texts and improves the generalization ability of post-editing. The feed-forward neural network combines representation information from different positions, further improving the capability to represent sentence information. The method therefore improves the processing efficiency and accuracy of post-editing, so that the resulting post-edited text is more accurate.
An embodiment of the present invention discloses another post-editing method for machine-translated text. Compared with the previous embodiment, this embodiment further describes and optimizes the technical solution.
Referring to Fig. 2, another post-editing method for machine-translated text provided by an embodiment of the present invention comprises:
S201: obtaining a source text and a machine translation of the source text;
S202: extracting first text features of the source text through a self-attention mechanism, and processing the first text features with a feed-forward neural network to obtain a first vector representing the source text;
S203: extracting second text features of the machine-translated text through a self-attention mechanism, optimizing the second text features by applying an attention mechanism over the first vector, and processing the optimized second text features with a feed-forward neural network to obtain a second vector representing the machine-translated text;
S204: generating the post-edited text of the machine-translated text word by word, from left to right, according to the first vector and the second vector;
S205: calculating the value of the cross-entropy loss function between the post-edited text and the standard translation of the source text;
Specifically, the standard translation of the source text is the final text, conforming to human language style, obtained after post-editing the machine translation of the source text. Calculating the cross-entropy loss value between the post-edited text and the standard translation can be understood as judging the similarity between the two.
When the cross-entropy loss value between the post-edited text and the standard translation is large, the similarity between them is small; the two can be considered different, and the post-edited text needs further optimization and processing. When the cross-entropy loss value is small, the similarity between them is large, and to a certain extent the two can be considered the same.
This embodiment takes the sentence-level loss function into account, which provides a better optimization basis for generating the post-edited text.
S206: judging whether the cross-entropy loss value is smaller than a preset threshold; if so, executing S208; if not, executing S207;
S207: updating the generation parameters according to the cross-entropy loss value, and executing S204 with the updated generation parameters;
Specifically, the loss of the post-edited text can be regarded as the difference between the post-edited text and the standard translation, generally expressed by the edit distance between the two texts; the smaller the edit distance between two texts, the more similar they are.
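A standard dynamic-programming sketch of this edit (Levenshtein) distance over word sequences is given below; it is an illustrative implementation, not one prescribed by the patent.

```python
def edit_distance(a, b):
    # Levenshtein distance between two word sequences a and b.
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n]

print(edit_distance("the cat sat".split(), "the cat sat down".split()))  # 1
```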
S208: determining the generated post-edited text as the standard translation result of the machine-translated text.
Wherein calculating the value of the cross-entropy loss function between the post-edited text and the standard translation of the source text comprises:
obtaining the standard translation, and extracting third text features of the standard translation through a masked self-attention mechanism;
optimizing the third text features by applying an attention mechanism over the first vector, and optimizing the third text features a second time by applying an attention mechanism over the second vector;
processing the twice-optimized third text features with a feed-forward neural network to obtain a third vector representing the standard translation;
vectorizing the post-edited text into a fourth vector, and calculating the cross-entropy loss value between the fourth vector and the third vector.
As can be seen, this embodiment provides another post-editing method for machine-translated text. The method extracts text features of the source text and the machine-translated text through self-attention, capturing the internal structure of both texts, so that the extracted features are more specific and fine-grained, which improves the accuracy of post-editing. Meanwhile, optimizing the second text features of the machine translation with an attention mechanism over the first vector of the source text correlates features between the two texts and improves the generalization ability of post-editing. The feed-forward neural network combines representation information from different positions, further improving the capability to represent sentence information. The method therefore improves the processing efficiency and accuracy of post-editing, so that the resulting post-edited text is more accurate.
Based on any of the above embodiments, it should be noted that extracting the first text features of the source text through a self-attention mechanism and processing the first text features with a feed-forward neural network to obtain the first vector representing the source text comprises:
processing the source text through a residual neural network to obtain the first vector;
wherein each network layer of the residual neural network consists of a self-attention sublayer and a feed-forward sublayer.
Based on any of the above embodiments, it should be noted that optimizing the second text features by applying an attention mechanism over the first vector comprises:
optimizing the second text features according to the attention-mechanism formula:

Attention(Q, K, V) = softmax(QK^T / √d_k) · V

where Q denotes the query items in the second text features, K and V denote a pair of keys and values, and d_k is the dimension of the keys.
Based on any of the above embodiments, it should be noted that generating the post-edited text of the machine-translated text word by word from left to right according to the first vector and the second vector comprises:
generating the post-edited text according to the text generation formula:

P(y | m, x) = ∏_t P(y_t | y_<t, m, x)

where x denotes the first vector, m denotes the second vector, y denotes the post-edited text, and P(y|m,x) denotes the conditional probability of generating the post-edited text. The conditional probability of generating any single word of the post-edited text is P(y_t | y_<t, m, x) = Softmax(W_o·z_t + b_o), where y_t denotes the word generated at time t, W_o and b_o are generation parameters, and z_t denotes the output after passing through the network layer.
If a post-editing model is constructed according to the post-editing method provided by the present invention, the network layer here is the last layer of the whole post-editing model.
A post-editing apparatus for machine-translated text provided by an embodiment of the present invention is introduced below; the apparatus described below and the post-editing method for machine-translated text described above may be referred to in conjunction with each other.
Referring to Fig. 3, a post-editing apparatus for machine-translated text provided by an embodiment of the present invention comprises:
an obtaining module 301, configured to obtain a source text and a machine translation of the source text;
a first processing module 302, configured to extract first text features of the source text through a self-attention mechanism and process the first text features with a feed-forward neural network to obtain a first vector representing the source text;
a second processing module 303, configured to extract second text features of the machine-translated text through a self-attention mechanism, optimize the second text features by applying an attention mechanism over the first vector, and process the optimized second text features with a feed-forward neural network to obtain a second vector representing the machine-translated text;
a generation module 304, configured to generate the post-edited text of the machine-translated text word by word, from left to right, according to the first vector and the second vector.
The apparatus further comprises:
a calculation module, configured to calculate the value of the cross-entropy loss function between the post-edited text and the standard translation of the source text;
a judging module, configured to judge whether the cross-entropy loss value is smaller than a preset threshold;
an execution module, configured to, when the cross-entropy loss value is not smaller than the preset threshold, update the generation parameters according to the cross-entropy loss value and execute the steps of the generation module with the updated generation parameters.
Wherein the calculation module comprises:
an obtaining unit, configured to obtain the standard translation and extract third text features of the standard translation through a masked self-attention mechanism;
a first optimization unit, configured to optimize the third text features by applying an attention mechanism over the first vector, and to optimize the third text features a second time by applying an attention mechanism over the second vector;
a second optimization unit, configured to process the twice-optimized third text features with a feed-forward neural network to obtain a third vector representing the standard translation;
a calculation unit, configured to vectorize the post-edited text into a fourth vector and calculate the cross-entropy loss value between the fourth vector and the third vector.
Wherein the first processing module is specifically configured to:
process the source text through a residual neural network to obtain the first vector;
wherein each network layer of the residual neural network consists of a self-attention sublayer and a feed-forward sublayer.
Wherein the second processing module is specifically configured to:
optimize the second text features according to the attention-mechanism formula:

Attention(Q, K, V) = softmax(QK^T / √d_k) · V

where Q denotes the query items in the second text features, K and V denote a pair of keys and values, and d_k is the dimension of the keys.
Wherein the generation module is specifically configured to:
generate the post-edited text according to the text generation formula:

P(y | m, x) = ∏_t P(y_t | y_<t, m, x)

where x denotes the first vector, m denotes the second vector, y denotes the post-edited text, and P(y|m,x) denotes the conditional probability of generating the post-edited text. The conditional probability of generating any single word of the post-edited text is P(y_t | y_<t, m, x) = Softmax(W_o·z_t + b_o), where y_t denotes the word generated at time t, W_o and b_o are generation parameters, and z_t denotes the output after passing through the network layer.
As can be seen, this embodiment provides a post-editing apparatus for machine-translated text comprising an obtaining module, a first processing module, a second processing module and a generation module. First, the obtaining module obtains the source text and its machine translation; the first processing module then extracts first text features of the source text through a self-attention mechanism and processes them with a feed-forward neural network to obtain a first vector representing the source text; the second processing module extracts second text features of the machine-translated text through a self-attention mechanism, optimizes them by applying an attention mechanism over the first vector, and processes the optimized features with a feed-forward neural network to obtain a second vector representing the machine-translated text; finally, the generation module generates the post-edited text word by word, from left to right, according to the first vector and the second vector. With this division of labor, each module performs its own function, which improves the processing efficiency and accuracy of post-editing, so that the resulting post-edited text is more accurate.
A post-editing device for machine-translated text provided by an embodiment of the present invention is introduced below; the device described below and the post-editing method and apparatus described above may be referred to in conjunction with each other.
Referring to Fig. 4, a post-editing device for machine-translated text provided by an embodiment of the present invention comprises:
a memory 401 for storing a computer program;
a processor 402, configured to implement the steps of the post-editing method for machine-translated text described in any of the above embodiments when executing the computer program.
The processor may be a central processing unit (CPU) or a graphics processing unit (GPU); a GPU has clear advantages when processing large-scale data.
A readable storage medium provided by an embodiment of the present invention is introduced below; the readable storage medium described below and the post-editing method, apparatus and device described above may be referred to in conjunction with each other.
A readable storage medium stores a computer program which, when executed by a processor, implements the steps of the post-editing method for machine-translated text described in any of the above embodiments.
According to the post-editing method provided by the present invention, a post-editing network model as shown in Fig. 5 can be constructed. The model comprises a source-text processing network, a machine-translation processing network and a standard-translation processing network, all of which are residual networks.
The source-text processing network has N layers, each consisting of a self-attention sublayer and a feed-forward sublayer. The machine-translation processing network has N layers, each consisting of a self-attention sublayer, an attention sublayer and a feed-forward sublayer; its attention sublayer applies attention over the source text. The standard-translation processing network has N layers, each consisting of a masked self-attention sublayer, an attention sublayer and a feed-forward sublayer; its attention sublayers apply attention over the source text and over the machine-translated text.
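Purely as an illustrative sketch, the following NumPy snippet stacks N = 2 layers of the source-text processing network, each a self-attention sublayer and a feed-forward sublayer wrapped in residual connections; the Q/K/V projections and layer normalization are omitted for brevity, and all dimensions are assumed.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, d_k):
    # Single-head self-attention: the sequence attends to itself.
    scores = X @ X.T / np.sqrt(d_k)
    return softmax(scores) @ X

def ffn(X, W1, b1, W2, b2):
    # FFN(x) = max(0, xW1 + b1)W2 + b2
    return np.maximum(0.0, X @ W1 + b1) @ W2 + b2

def encoder_layer(X, params):
    # Residual connection around each sublayer, as in the source-text network.
    X = X + self_attention(X, X.shape[-1])
    X = X + ffn(X, *params)
    return X

d_model, d_ff, seq_len = 8, 32, 5
rng = np.random.default_rng(2)
params = (rng.normal(size=(d_model, d_ff)) * 0.1, np.zeros(d_ff),
          rng.normal(size=(d_ff, d_model)) * 0.1, np.zeros(d_model))
X = rng.normal(size=(seq_len, d_model))   # source-text embeddings
for _ in range(2):                        # stack N = 2 layers for the example
    X = encoder_layer(X, params)
print(X.shape)  # (5, 8)
```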
The attention mechanism maps a query and a set of key-value pairs: it computes the dot products of the query items Q with all keys K, divides each by √d_k for scaling, and finally applies a softmax function to obtain the weight distribution over the values V. This can be described by the following formula:

Attention(Q, K, V) = softmax(QK^T / √d_k) · V
The multi-head attention mechanism allows the model to jointly attend to information from different representation subspaces at different positions, which can be expressed by the following formulas:

MultiHead(Q, K, V) = Concat(head_1, ..., head_h)·W^O

where head_i = Attention(Q·W_i^Q, K·W_i^K, V·W_i^V)

and W_i^Q, W_i^K, W_i^V and W^O are learnable projection matrices.
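A minimal NumPy sketch of this multi-head computation follows; the head count, dimensions and variable names are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = K.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head(Q, K, V, Wq, Wk, Wv, Wo, h):
    # Project per head, run attention, concatenate, project back with Wo.
    heads = [attention(Q @ Wq[i], K @ Wk[i], V @ Wv[i]) for i in range(h)]
    return np.concatenate(heads, axis=-1) @ Wo

d_model, h = 8, 2
d_k = d_model // h
rng = np.random.default_rng(3)
Wq, Wk, Wv = (rng.normal(size=(h, d_model, d_k)) for _ in range(3))
Wo = rng.normal(size=(h * d_k, d_model))
X = rng.normal(size=(5, d_model))
print(multi_head(X, X, X, Wq, Wk, Wv, Wo, h).shape)  # (5, 8)
```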
The feed-forward neural network consists of two linear transformations with a ReLU activation between them, which can be expressed by the following formula:

FFN(x) = max(0, xW_1 + b_1)W_2 + b_2

where W_1, W_2, b_1 and b_2 are all trainable parameters.
The Discriminator in Fig. 5 is the discriminator of the post-editing network model. It uses a recurrent neural network, specifically a bidirectional Gated Recurrent Unit (GRU) structure, to represent sentences. The discriminator reads the post-edited text and the standard translation, encodes the word embeddings of the two sentences with bidirectional GRUs to obtain content vectors, and is given a loss function whose objective is to discriminate between generated text and reference text, making the discrimination increasingly accurate.
The formula by which the discriminator computes the cross-entropy discrimination score between the post-edited text and the standard translation is:

P(y, r) = sigmoid(W_d·||H_y − H_r|| + b_d)
The loss function of the discriminator is expressed by the following formula:

L(H_y, H_r) = −log(sigmoid(W_d·||H_y − H_r|| + b_d))

where ||H_y − H_r|| denotes the Euclidean distance between the content vectors of the post-edited text and the standard translation, and W_d and b_d are both trainable parameters.
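A small NumPy sketch of this discriminator loss follows; the content vectors H_y and H_r are random stand-ins here rather than real GRU outputs, and the parameter values are assumed.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_loss(H_y, H_r, W_d, b_d):
    # L(H_y, H_r) = -log(sigmoid(W_d * ||H_y - H_r|| + b_d))
    dist = np.linalg.norm(H_y - H_r)   # Euclidean distance between content vectors
    return -np.log(sigmoid(W_d * dist + b_d))

rng = np.random.default_rng(4)
H_y = rng.normal(size=16)   # content vector of the post-edited text
H_r = rng.normal(size=16)   # content vector of the standard translation
print(discriminator_loss(H_y, H_r, W_d=0.5, b_d=0.1))
```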
When the discrimination result output by the discriminator does not satisfy the preset output condition, the loss of the post-edited text is calculated and fed back to optimize the network parameters of the post-editing network model, so that more accurate post-edited text is generated.
The objective function for generating the post-edited text is set to maximize its expected value. The generated post-edited text is sampled and the gradient is estimated accordingly, which yields the parameter update function of the generator.
When training the discriminator of the post-editing network model, the generator parameters are frozen and the discriminator's loss function is minimized. Specifically, the generator is trained for four epochs, then the discriminator is trained for one epoch, iterating in turn until both the generator and the discriminator of the model converge, at which point training stops.
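A hedged outline of this alternating schedule follows; the function names and the convergence test are placeholder hooks invented for this sketch, not APIs from the patent.

```python
def train(generator_epoch, discriminator_epoch, converged, max_rounds=100):
    # Alternate training: 4 generator epochs, then 1 discriminator epoch,
    # repeated until the convergence test passes.
    for _ in range(max_rounds):
        for _ in range(4):
            generator_epoch()       # discriminator parameters frozen here
        discriminator_epoch()       # generator parameters frozen here
        if converged():
            break

# Toy stand-ins so the loop runs; real steps would update network parameters.
state = {"rounds": 0}
def gen_step(): pass
def disc_step(): state.update(rounds=state["rounds"] + 1)
train(gen_step, disc_step, lambda: state["rounds"] >= 3)
print(state["rounds"])  # 3
```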
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to in conjunction with each other.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910079518.1A CN109635269B (en) | 2019-01-31 | 2019-01-31 | Method and device for post-translation editing of machine-translated text |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910079518.1A CN109635269B (en) | 2019-01-31 | 2019-01-31 | Method and device for post-translation editing of machine-translated text |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109635269A CN109635269A (en) | 2019-04-16 |
CN109635269B true CN109635269B (en) | 2023-06-16 |
Family
ID=66062387
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910079518.1A Active CN109635269B (en) | 2019-01-31 | 2019-01-31 | Method and device for post-translation editing of machine-translated text |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109635269B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110765791B (en) * | 2019-11-01 | 2021-04-06 | Tsinghua University | Method and device for automatic post-editing of machine translation |
CN110909527B (en) * | 2019-12-03 | 2023-12-08 | Beijing ByteDance Network Technology Co., Ltd. | Text processing model running method and device, electronic equipment and storage medium |
CN116069901B (en) * | 2023-02-03 | 2023-08-11 | Shanghai Yizhe Information Technology Co., Ltd. | Non-translated element identification method based on editing behavior and rule |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107301173A (en) * | 2017-06-22 | 2017-10-27 | Beijing Institute of Technology | Multi-source neural network automatic post-editing system and method based on a splicing and remixing mode |
CN107967262A (en) * | 2017-11-02 | 2018-04-27 | Inner Mongolia University of Technology | Neural network Mongolian-Chinese machine translation method |
CN108563640A (en) * | 2018-04-24 | 2018-09-21 | Global Tone Communication Technology Co., Ltd. | Multilingual-pair neural network machine translation method and system |
CN109241536A (en) * | 2018-09-21 | 2019-01-18 | Zhejiang University | Sentence ordering method based on deep-learning self-attention mechanism |
CN109271646A (en) * | 2018-09-04 | 2019-01-25 | Tencent Technology (Shenzhen) Co., Ltd. | Text translation method and apparatus, readable storage medium and computer device |
Also Published As
Publication number | Publication date |
---|---|
CN109635269A (en) | 2019-04-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||