CN113515960A - Automatic translation quality evaluation method fusing syntactic information - Google Patents

Automatic translation quality evaluation method fusing syntactic information

Info

Publication number
CN113515960A
CN113515960A
Authority
CN
China
Prior art keywords
graph
bilingual
neural network
model
syntactic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110797021.0A
Other languages
Chinese (zh)
Other versions
CN113515960B (en)
Inventor
陆晓蕾
倪斌
韩潮
张培欣
管新潮
李力
陈晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University filed Critical Xiamen University
Priority to CN202110797021.0A priority Critical patent/CN113515960B/en
Publication of CN113515960A publication Critical patent/CN113515960A/en
Application granted granted Critical
Publication of CN113515960B publication Critical patent/CN113515960B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/40 Processing or translation of natural language
    • G06F 40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

A method for automatically evaluating translation quality by fusing syntactic information, relating to the technical field of translation. The method comprises the following steps: acquiring bilingual text representation vectors for the input text; constructing a syntactic dependency tree for each side of the bilingual input text to form a syntactic graph; encoding the node-relation features with a graph neural network, splicing the results, and outputting a quality score through a simple sigmoid layer on top; taking the root mean square error between the model output and the data labels as the loss, and updating the quality prediction model parameters through a back-propagation algorithm. By using a graph neural network, the method neatly addresses the lack of syntactic information in automatic translation quality evaluation, an approach not previously seen in this field. Adding graph-neural-network-encoded syntactic information on top of a pre-trained model lets the model express semantic and syntactic information simultaneously, and generally improves the Pearson correlation coefficient by about 19% compared with using the pre-trained model alone.

Description

Automatic translation quality evaluation method fusing syntactic information
Technical Field
The invention relates to the technical field of translation, and in particular to a method for automatically evaluating translation quality by fusing syntactic information.
Background
With the development of neural machine translation and natural language processing, how to quantitatively and automatically assess translation quality (quality estimation, QE) has attracted extensive attention in industry and academia. Big-data-driven automatic translation evaluation can estimate translation quality without a reference translation. Current QE methods fall mainly into three categories: (1) feature engineering, (2) neural networks, and (3) pre-trained models. In the first, manually constructed features are fed into traditional machine-learning algorithms; represented by QuEst and QuEst++, its drawbacks are limited performance and difficulty handling new linguistic phenomena. The second is usually two-stage: a bilingual model is trained on a large parallel corpus to obtain word representations, which are then fed into an upper neural network (such as an LSTM); represented by Predictor-Estimator, its drawbacks are the need for a large parallel corpus and long training time. The third has become more common in the last two years: attaching a simple fully connected layer on top of a multilingual pre-trained model (such as mBERT or XLM-R) already achieves good results. Examining these techniques shows that current methods rely mainly on bilingual semantic features to estimate translation quality. Syntactic features, however, are rarely taken into account in translation quality estimation, which limits model performance.
Therefore, incorporating syntactic information into the model through a graph neural network, on top of a pre-trained model, can improve the effect of translation quality estimation to a certain extent.
Disclosure of Invention
The invention aims to provide a method for automatically evaluating translation quality by fusing syntactic information, which can improve the effect of translation quality estimation.
The invention comprises the following steps:
1) acquiring bilingual text representation vectors for the input text;
2) constructing a syntactic dependency tree for each side of the bilingual input text to form a syntactic graph;
3) encoding the node-relation features with a graph neural network, splicing the results, and outputting a quality score through a simple sigmoid layer on top;
4) taking the root mean square error between the model output and the data labels as the loss, and updating the quality prediction model parameters through a back-propagation algorithm.
In step 1), the specific method for acquiring the bilingual text representation vectors of the input text may be one of:
(1) using a bilingual pre-trained model, such as XLM-R or mBERT, whose parameters may be fine-tuned during model training;
(2) using Word2Vec;
(3) building the word vector representation layer from a trained model obtained with the open-source toolkit Transformers.
In step 2), constructing a syntactic dependency tree for each side of the bilingual input text may proceed as follows: extract the syntactic dependencies of the bilingual input with a self-built dependency-parsing algorithm or an open-source toolkit, such as NLTK or spaCy; represent the dependencies between sentence components as a directed graph; the dependency graph contains nodes and the relation types between them, each represented by a triple, such as (node A, relation r, node B). The syntactic dependencies of the whole sentence are thereby encoded into a triple list [triple 1, triple 2, triple 3, …, triple n]; the triple list is then converted into matrix form using an adjacency matrix, a V × V two-dimensional array where V is the number of nodes in the graph. Let adj[][] be the adjacency matrix; then:
adj[i][j] = 1, if the edge (vi, vj) exists or i = j; adj[i][j] = 0, otherwise
wherein (vi, vj) denotes the edge from node i to node j; if (vi, vj) does not exist, adj[i][j] is assigned 0; if (vi, vj) exists or i = j, adj[i][j] is assigned 1.
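The triple-to-matrix conversion above can be sketched in a few lines of plain Python. This is a minimal illustration with a hypothetical toy sentence, not the patented implementation; a real system would obtain the triples from a dependency parser such as spaCy.

```python
# Minimal sketch: turn a dependency-triple list into the adjacency matrix
# adj[][] described above, where adj[i][j] = 1 if the edge (vi, vj) exists
# or i == j, and 0 otherwise. Node names and triples here are toy values.

def triples_to_adjacency(triples, nodes):
    """triples: list of (head, relation, dependent); nodes: ordered node list."""
    index = {node: i for i, node in enumerate(nodes)}
    V = len(nodes)
    # Self-loops (i == j) are set to 1, matching the formula in the text.
    adj = [[1 if i == j else 0 for j in range(V)] for i in range(V)]
    for head, _relation, dependent in triples:
        adj[index[head]][index[dependent]] = 1  # directed edge head -> dependent
    return adj

# Toy sentence "cats chase mice": the verb governs both nouns.
nodes = ["chase", "cats", "mice"]
triples = [("chase", "nsubj", "cats"), ("chase", "obj", "mice")]
adj = triples_to_adjacency(triples, nodes)
```

The resulting matrix can then be handed to the graph neural network of step 3).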
In step 3), encoding the node-relation features with a graph neural network means applying deep learning to the graph structure: the node relations of the syntactic graph are encoded by the graph neural network, and the two sides of the bilingual input are encoded into implicit vectors Hs and Ht respectively; suitable graph neural networks include the generic GNN, the graph convolutional network GCN, and GAT.
Outputting a quality score through a simple sigmoid layer on top means splicing the graph-neural-network-encoded bilingual representations Hs and Ht into a vector H, then attaching a fully connected layer as the output layer, whose sigmoid activation yields the output OUT, namely:
OUT = Sigmoid(WH + B)
where W is a linear transformation parameter and B is a bias term.
The number of neurons in the output layer (i.e. the number of final outputs) is determined by the task: for sentence-level QE the number of neurons is 1; for word-level QE it is the number of words.
Compared with the prior art, the invention has the following outstanding advantages and technical effects:
the method skillfully solves the problem of the introduction of the lack of syntactic information in the automatic evaluation of the translation quality by using the graph neural network, and the method is not seen in the field of the automatic evaluation of the translation quality. According to the method, the graph neural network coding syntactic information is added on the basis of the pre-training model, so that the model can express the semantic and syntactic information at the same time, and the effect of about 19% on Pearson correlation coefficients can be generally improved compared with the effect of using the pre-training model alone.
Drawings
FIG. 1 is a diagram of a model architecture of the present invention.
Fig. 2 illustrates the generation of a syntactic graph.
Detailed Description
The following examples will further illustrate the present invention with reference to the accompanying drawings.
The embodiment of the invention comprises three steps: obtaining the word vector representations; generating the syntactic graph and applying the graph neural network; and the quality prediction model. These three steps are also shown in the model architecture diagram of Fig. 1 and are described below.
The word vector represents:
first, an input bilingual text representation vector needs to be obtained. The present invention proposes to use bilingual pre-training models, such as XLM-R (Conneau et al, 2019) or mBERT (Devrin et al, 2018), to obtain word vector representations of the input text and to perform parameter tuning during the model training process. Of course, the method using Word2Vec is also possible. The word vector representation layer can be built using a model obtained with the open source toolkit transforms (ref: https:// github. com/hugging face/transforms) trained. As shown in FIG. 1, the Source language Source and the Target language Target respectively obtain a word vector E after passing through XLM-R1,E2,…,EnAnd F1,F2,Fm
Syntax tree generation and graph neural network:
this section is the core part of the present invention. It mainly contains two large pieces of content: syntactic graph generation and graph neural networks.
Generating a syntactic graph
Extract the syntactic dependencies of the bilingual input with a self-built dependency-parsing algorithm or an open-source toolkit, such as NLTK or spaCy. The dependencies between sentence components can be represented with a directed graph (Fig. 2). The dependency graph contains nodes and the relation types between them, which can be represented by triples such as (node A, relation r, node B). This encodes the syntactic dependencies of the entire sentence into a triple list [triple 1, triple 2, triple 3, …, triple n]. The triple list is then converted to matrix form using an adjacency matrix, a V × V two-dimensional array where V is the number of nodes in the graph. Let adj[][] be the adjacency matrix; then:
adj[i][j] = 1, if the edge (vi, vj) exists or i = j; adj[i][j] = 0, otherwise
wherein (vi, vj) denotes the edge from node i to node j; if (vi, vj) does not exist, adj[i][j] is assigned 0; if (vi, vj) exists or i = j, adj[i][j] is assigned 1.
Graph neural network
The graph neural network applies deep learning to graph structures. The node relations of the syntactic graph can be encoded by a graph neural network; commonly used variants include the generic GNN, the graph convolutional network (GCN), and the graph attention network (GAT). This embodiment takes GAT as an example. GAT adopts a multi-head attention mechanism, can assign different weights to different nodes, and its training depends on pairs of adjacent nodes rather than on a specific network structure. Assume the graph contains N nodes and that the feature vector of each node (obtained in 2.1) is h_i. The output of GAT for node i, h'_i, averaged over the attention heads, is then:
h'_i = σ( (1/K) Σ_{k=1}^{K} Σ_{j∈N_i} α_ij^k W^k h_j )
where K is the total number of attention heads, α_ij^k is the k-th attention coefficient between node i and node j, W^k is a linear transformation parameter, N_i is the set of neighbor nodes of i, and σ is the activation function, typically the leaky rectified linear unit (LeakyReLU). Through this step, the two sides of the bilingual input are encoded into implicit vectors Hs and Ht respectively, as shown in Fig. 1.
A quality prediction model:
and splicing the bilingual representation Hs and Ht coded by the neural network of the graph to obtain a vector H ═ Hs: Ht. After that, a full connection layer is connected as an output layer, the activation function is Sigmoid to obtain an output OUT, that is:
OUT=Sigmoid(WH+B)
where W is the linear transformation parameter and B is the bias term. The number of output layer neurons (i.e., the final number of outputs) depends on the specifics of the task. For example: if the QE is sentence-level, the number of the neurons is 1; if the word level QE is obtained, the number of the neurons is the number of the words. The output of the model is used as a quality score, the root mean square error of the label of the data is used as a loss, and the whole model parameter is updated through a back propagation algorithm.
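The training loss described above, root mean square error between predicted quality scores and gold labels, is simple enough to state directly. The sketch below computes only the loss value on toy numbers; in a real system this quantity would drive back-propagation through the whole model.

```python
# Root mean square error (RMSE) between predicted quality scores and the
# gold labels, as used for the loss above. Scores and labels are toy values.
import math

def rmse(predictions, labels):
    assert len(predictions) == len(labels) and predictions
    se = sum((p - y) ** 2 for p, y in zip(predictions, labels))
    return math.sqrt(se / len(predictions))

loss = rmse([0.8, 0.4, 0.6], [1.0, 0.4, 0.5])
```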
The invention provides a syntax-fusing translation quality estimation technique: graph-neural-network-encoded syntactic information is added on top of a pre-trained model so that the model can express semantic and syntactic information simultaneously. Experiments on the QE datasets of the WMT 2020 Conference on Machine Translation show that adding syntactic information generally improves the Pearson correlation coefficient by about 19% compared with a plain pre-trained model. The scheme of the invention can therefore improve the effect of translation quality estimation.

Claims (6)

1. A method for automatically evaluating translation quality by fusing syntactic information, characterized by comprising the following steps:
1) acquiring bilingual text representation vectors for the input text;
2) constructing a syntactic dependency tree for each side of the bilingual input text to form a syntactic graph;
3) encoding the node-relation features with a graph neural network, splicing the results, and outputting a quality score through a simple sigmoid layer on top;
4) taking the root mean square error between the model output and the data labels as the loss, and updating the quality prediction model parameters through a back-propagation algorithm.
2. The method for automatically evaluating translation quality by fusing syntactic information as claimed in claim 1, wherein in step 1) the specific method for acquiring the bilingual text representation vectors of the input text is one of the following:
(1) using a bilingual pre-trained model, such as XLM-R or mBERT, whose parameters may be fine-tuned during model training;
(2) using Word2Vec;
(3) building the word vector representation layer from a trained model obtained with the open-source toolkit Transformers.
3. The method for automatically evaluating translation quality by fusing syntactic information as claimed in claim 1, wherein in step 2) the specific method for constructing a syntactic dependency tree for each side of the bilingual input text to form the syntactic graph is: extract the syntactic dependencies of the bilingual input with a self-built dependency-parsing algorithm or an open-source toolkit; represent the dependencies between sentence components as a directed graph; the dependency graph contains nodes and the relation types between them, each represented by a triple, such as (node A, relation r, node B); the syntactic dependencies of the whole sentence are thereby encoded into a triple list [triple 1, triple 2, triple 3, …, triple n]; the triple list is then converted into matrix form using an adjacency matrix, a V × V two-dimensional array where V is the number of nodes in the graph; let adj[][] be the adjacency matrix, then:
adj[i][j] = 1, if the edge (vi, vj) exists or i = j; adj[i][j] = 0, otherwise
wherein (vi, vj) denotes the edge from node i to node j; if (vi, vj) does not exist, adj[i][j] is assigned 0; if (vi, vj) exists or i = j, adj[i][j] is assigned 1.
4. The method for automatically evaluating translation quality by fusing syntactic information as claimed in claim 1, wherein in step 3) encoding the node-relation features with a graph neural network means applying deep learning to the graph structure: the node relations of the syntactic graph are encoded by the graph neural network, and the two sides of the bilingual input are encoded into implicit vectors Hs and Ht respectively; the graph neural network comprises the generic GNN, the graph convolutional network GCN, and GAT.
5. The method for automatically evaluating translation quality by fusing syntactic information as claimed in claim 1, wherein in step 3) outputting a quality score through a simple sigmoid layer on top means: splicing the graph-neural-network-encoded bilingual representations Hs and Ht to obtain the vector H = [Hs : Ht]; then attaching a fully connected layer as the output layer, whose sigmoid activation yields the output OUT, namely:
OUT = Sigmoid(WH + B)
where W is a linear transformation parameter and B is a bias term.
6. The method for automatically evaluating translation quality by fusing syntactic information as claimed in claim 5, wherein the number of output-layer neurons is determined by the task: for sentence-level QE the number of neurons is 1; for word-level QE it is the number of words.
CN202110797021.0A 2021-07-14 2021-07-14 Automatic translation quality assessment method integrating syntax information Active CN113515960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110797021.0A CN113515960B (en) 2021-07-14 2021-07-14 Automatic translation quality assessment method integrating syntax information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110797021.0A CN113515960B (en) 2021-07-14 2021-07-14 Automatic translation quality assessment method integrating syntax information

Publications (2)

Publication Number Publication Date
CN113515960A true CN113515960A (en) 2021-10-19
CN113515960B CN113515960B (en) 2024-04-02

Family

ID=78066996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110797021.0A Active CN113515960B (en) 2021-07-14 2021-07-14 Automatic translation quality assessment method integrating syntax information

Country Status (1)

Country Link
CN (1) CN113515960B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116720531A (en) * 2023-06-20 2023-09-08 内蒙古工业大学 Mongolian neural machine translation method based on source language syntax dependency and quantization matrix

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170139905A1 (en) * 2015-11-17 2017-05-18 Samsung Electronics Co., Ltd. Apparatus and method for generating translation model, apparatus and method for automatic translation
CN110688861A (en) * 2019-09-26 2020-01-14 沈阳航空航天大学 Multi-feature fusion sentence-level translation quality estimation method
CN111597830A (en) * 2020-05-20 2020-08-28 腾讯科技(深圳)有限公司 Multi-modal machine learning-based translation method, device, equipment and storage medium
CN112347795A (en) * 2020-10-04 2021-02-09 北京交通大学 Machine translation quality evaluation method, device, equipment and medium
CN112464676A (en) * 2020-12-02 2021-03-09 北京捷通华声科技股份有限公司 Machine translation result scoring method and device
CN112613326A (en) * 2020-12-18 2021-04-06 北京理工大学 Tibetan language neural machine translation method fusing syntactic structure
CN113033218A (en) * 2021-04-16 2021-06-25 沈阳雅译网络技术有限公司 Machine translation quality evaluation method based on neural network structure search

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170139905A1 (en) * 2015-11-17 2017-05-18 Samsung Electronics Co., Ltd. Apparatus and method for generating translation model, apparatus and method for automatic translation
CN110688861A (en) * 2019-09-26 2020-01-14 沈阳航空航天大学 Multi-feature fusion sentence-level translation quality estimation method
CN111597830A (en) * 2020-05-20 2020-08-28 腾讯科技(深圳)有限公司 Multi-modal machine learning-based translation method, device, equipment and storage medium
CN112347795A (en) * 2020-10-04 2021-02-09 北京交通大学 Machine translation quality evaluation method, device, equipment and medium
CN112464676A (en) * 2020-12-02 2021-03-09 北京捷通华声科技股份有限公司 Machine translation result scoring method and device
CN112613326A (en) * 2020-12-18 2021-04-06 北京理工大学 Tibetan language neural machine translation method fusing syntactic structure
CN113033218A (en) * 2021-04-16 2021-06-25 沈阳雅译网络技术有限公司 Machine translation quality evaluation method based on neural network structure search

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陆晓蕾 et al.: "计算语言学中的重要术语——词向量" ("An important term in computational linguistics: word vectors"), 《中国科技术语》 (China Terminology), vol. 22, no. 3 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116720531A (en) * 2023-06-20 2023-09-08 内蒙古工业大学 Mongolian neural machine translation method based on source language syntax dependency and quantization matrix
CN116720531B (en) * 2023-06-20 2024-05-28 内蒙古工业大学 Mongolian neural machine translation method based on source language syntax dependency and quantization matrix

Also Published As

Publication number Publication date
CN113515960B (en) 2024-04-02

Similar Documents

Publication Publication Date Title
CN110929030B (en) Text abstract and emotion classification combined training method
CN110069790B (en) Machine translation system and method for contrasting original text through translated text retranslation
CN110738057B (en) Text style migration method based on grammar constraint and language model
CN110210032B (en) Text processing method and device
CN110516244B (en) Automatic sentence filling method based on BERT
CN109933808B (en) Neural machine translation method based on dynamic configuration decoding
CN112989796B (en) Text naming entity information identification method based on syntactic guidance
CN113435211B (en) Text implicit emotion analysis method combined with external knowledge
CN114218379B (en) Attribution method for question answering incapacity of intelligent question answering system
CN112926337B (en) End-to-end aspect level emotion analysis method combined with reconstructed syntax information
WO2023137911A1 (en) Intention classification method and apparatus based on small-sample corpus, and computer device
CN114692602A (en) Drawing convolution network relation extraction method guided by syntactic information attention
CN115204143B (en) Method and system for calculating text similarity based on prompt
CN113609284A (en) Method and device for automatically generating text abstract fused with multivariate semantics
CN114218928A (en) Abstract text summarization method based on graph knowledge and theme perception
CN115374270A (en) Legal text abstract generation method based on graph neural network
CN114048301B (en) Satisfaction-based user simulation method and system
CN117033602A (en) Method for constructing multi-mode user mental perception question-answering model
Mandal et al. Futurity of translation algorithms for neural machine translation (NMT) and its vision
CN113515960A (en) Automatic translation quality evaluation method fusing syntactic information
CN114595700A (en) Zero-pronoun and chapter information fused Hanyue neural machine translation method
CN114218936A (en) Automatic generation algorithm for high-quality comments in media field
CN117251562A (en) Text abstract generation method based on fact consistency enhancement
CN111382333B (en) Case element extraction method in news text sentence based on case correlation joint learning and graph convolution
CN115017924B (en) Construction of neural machine translation model for cross-language translation and translation method thereof

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant