CN114896992A - Method, medium, and apparatus for improving automatic evaluation of machine translation quality using search - Google Patents

Method, medium, and apparatus for improving automatic evaluation of machine translation quality using search

Info

Publication number
CN114896992A
CN114896992A (application number CN202210460184.4A)
Authority
CN
China
Prior art keywords
machine translation
translation quality
word
evaluated
quality evaluation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210460184.4A
Other languages
Chinese (zh)
Inventor
黄书剑 (Shujian Huang)
郑鑫 (Xin Zheng)
赵千锋 (Qianfeng Zhao)
戴新宇 (Xinyu Dai)
张建兵 (Jianbing Zhang)
陈家骏 (Jiajun Chen)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN202210460184.4A priority Critical patent/CN114896992A/en
Publication of CN114896992A publication Critical patent/CN114896992A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method, a medium, and a device for improving automatic evaluation of machine translation quality using retrieval. The method comprises: a retrieval stage, in which, for a machine translation quality evaluation sentence pair, related parallel sentence pairs are searched in a database for the word to be evaluated in that pair; and a machine translation quality evaluation stage, in which the retrieved parallel sentence pairs are encoded and fused into the machine translation quality evaluation model. The method can directly and effectively use related parallel sentence pairs while alleviating the sparsity of machine translation quality evaluation training data; it better explains why the model makes its decisions; the model does not need to be retrained; and the tendency of end-to-end models to forget training data during training is avoided, improving the performance of the machine translation quality evaluation model.

Description

Method, medium, and apparatus for improving automatic evaluation of machine translation quality using search
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a method, a medium, and a device for improving automatic evaluation of machine translation quality using retrieval.
Background
With the accelerating globalization of today's society, machine translation has become crucial as a bridge connecting different languages. However, machine translation quality still falls short of human translation, so it needs to be evaluated to help people make better use of machine translation output. This quality evaluation is itself carried out by machine, i.e., automatic evaluation of machine translation quality. In recent years, automatic evaluation of machine translation quality has received increasing attention and has become a widely studied and discussed problem in the field of machine translation.
Currently, automatic assessment of machine translation quality can be roughly divided into two directions. The first is assessment with a reference translation, i.e., comparing the output of the machine translation system with the reference translation to produce a quantifiable index; classic works include BLEU [Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics: 311-318] and METEOR [Lavie, Alon and Abhaya Agarwal. 2007. METEOR: an automatic metric for MT evaluation with high levels of correlation with human judgments. Proceedings of the Second Workshop on Statistical Machine Translation: 228-231].
The second direction, which is more general, is machine translation quality assessment without a reference translation: given only a bilingual sentence pair, consisting of a source sentence and the translation produced by a machine translation system, a quality assessment system automatically judges the quality of the translation. The task covers both the word level and the sentence level: word-level quality estimation must judge the translation quality of each word in the translation, while sentence-level estimation gives a single score describing the overall quality of the translation. At present, the task is realized by manually annotating examples, treating the problem as a supervised learning and prediction task, and then learning with a deep network model.
At present, effective labeled data for machine translation quality estimation is very scarce; in open datasets, a single language pair has only a few thousand examples. To alleviate this data sparsity, researchers have attempted to assist training of the machine translation quality assessment task with external resources. Two approaches are common. The first uses a large-scale cross-language pre-training model as a basis and then fine-tunes it with supervised quality estimation data [Ranasinghe, Tharindu, Constantin Orasan, and Ruslan Mitkov. 2020. TransQuest: Translation Quality Estimation with Cross-lingual Transformers. Proceedings of the 28th International Conference on Computational Linguistics: 5070-5081; Ranasinghe, Tharindu, Constantin Orasan, and Ruslan Mitkov. 2021. An Exploratory Analysis of Multilingual Word-Level Quality Estimation with Cross-Lingual Transformers. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: 434-440]. The second supplements the original data with the help of massive parallel corpora, using them to generate synthetic machine translation quality estimation data [Cui, Qu, Shujian Huang, Jiahuan Li, Xiang Geng, Zaixiang Zheng, Guoping Huang, and Jiajun Chen. 2021. DirectQE: Direct Pretraining for Machine Translation Quality Estimation. Proceedings of the AAAI Conference on Artificial Intelligence 35(14): 12719-12727; Zheng, Yuanhang, Zhixing Tan, Meng Zhang, Mieradilijiang Maimaiti, Huanbo Luan, Maosong Sun, Qun Liu, and Yang Liu. 2021. Self-Supervised Quality Estimation for Machine Translation. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing].
Although both methods effectively improve the performance of machine translation quality estimation, they still perform end-to-end training and prediction entirely on top of black-box deep neural networks, and therefore have shortcomings in interpretability and flexibility, as well as incomplete utilization of external parallel data.
First, interpretability. When a quality evaluation model labels a translated word OK or BAD, one actually wants the reason behind that judgment: this makes the model's predictions more trustworthy and its decisions easier to accept. The black-box nature of current deep neural models, however, leaves them clearly lacking in interpretability of their output results.
Second, flexibility. Current machine translation quality evaluation models are usually trained once and then applied to all subsequent evaluations, which is limiting in some scenarios. Consider the following scenario: the model is trained on quality assessment data from the news domain and then evaluated on data from the medical domain. Existing models evidently suffer performance degradation in this scenario. To solve the problem, relevant samples must be collected again for continued training; but on one hand, training the model itself requires very large overhead, and on the other hand, because of the catastrophic forgetting inherent in neural networks, the model may forget its original knowledge during training, preventing continued reuse.
Finally, completeness of utilization of external parallel data. Although the parallel corpus used to synthesize quality assessment data may contain samples relevant to the translation currently being evaluated, the information in these relevant samples may not be preserved in the model during training. Existing models therefore cannot guarantee that, when the information of relevant parallel sentence pairs in external data is needed to assist a judgment, it is fully exploited.
Disclosure of Invention
The invention aims to overcome the deficiencies of existing machine translation quality evaluation models described in the background art in interpretability, flexibility, and completeness of external parallel corpus utilization. The invention provides a method for improving automatic evaluation of machine translation quality using retrieval, a storage medium, and an electronic device.
To achieve the above object, according to a first aspect of the present invention, a method for improving automatic evaluation of machine translation quality using retrieval comprises:
a retrieval stage: for a machine translation quality evaluation sentence pair, searching a database for parallel sentence pairs related to the word to be evaluated in that pair;
a machine translation quality evaluation stage: encoding the retrieved parallel sentence pairs and fusing them into the machine translation quality evaluation model.
In some possible embodiments, the retrieval phase comprises the steps of:
step 1, constructing a database by using parallel sentence pairs;
step 2, for the machine translation quality evaluation sentence pair, constructing a query sequence in the database for the word to be evaluated, and retrieving;
step 3, ranking the retrieved parallel sentence pairs for the word to be evaluated, and retaining the required parallel sentence pairs.
In some possible embodiments, for all parallel sentence pairs, the database constructs an inverted index for each word in each parallel sentence pair, so that searching for any word of a parallel sentence pair retrieves that pair.
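The inverted index described above can be sketched in a few lines of Python (an illustrative structure, not the patent's actual Lucene implementation): every word on both the source and target side maps to the set of sentence-pair ids that contain it.

```python
from collections import defaultdict

def build_inverted_index(parallel_pairs):
    """Map every word (source and target side) to the ids of the
    parallel sentence pairs that contain it."""
    index = defaultdict(set)
    for pair_id, (src, tgt) in enumerate(parallel_pairs):
        for word in src.split() + tgt.split():
            index[word].add(pair_id)
    return index

pairs = [
    ("水壶 在 炉子 上", "The kettle buzzes on a stove"),
    ("水獭 吃 鱼", "The otter eats fish"),
]
index = build_inverted_index(pairs)
print(sorted(index["kettle"]))  # [0]
```

Searching for any word of a stored pair (in either language) immediately yields that pair's id, which is the property the retrieval stage relies on.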
In some possible embodiments, the machine translation quality evaluation phase comprises the steps of:
step 4: using a cross-language pre-training model, encode the machine translation quality evaluation sentence pair and the parallel sentence pairs retrieved for its word to be evaluated, obtaining respectively the hidden representation of the word to be evaluated in the sentence pair and the hidden representations of the retrieved parallel sentence pairs;
step 5: concatenate the hidden representations obtained from the parallel sentence pairs retrieved for the word to be evaluated;
step 6: use the hidden representation of the word to be evaluated to extract information from the concatenated hidden representation of step 5 via multi-head attention;
step 7: fuse the hidden representation of the word to be evaluated and the representation extracted in step 6 through a gating mechanism, and input the resulting final representation into a multi-layer perceptron for classification, obtaining whether the word to be evaluated is translated correctly.
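The information extraction of step 6 can be illustrated with a simplified single-head sketch in NumPy (the patent's model uses learned multi-head attention projections, which are omitted here; dimensions and inputs are assumed for illustration):

```python
import numpy as np

def attention_extract(h_word, h_retrieved):
    """Use the evaluated word's hidden vector as the query and the
    concatenated retrieval states as keys/values (single-head sketch)."""
    d = h_word.shape[-1]
    scores = h_retrieved @ h_word / np.sqrt(d)   # one score per retrieved token
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax over retrieved tokens
    return weights @ h_retrieved                 # weighted sum of values

rng = np.random.default_rng(0)
h_word = rng.normal(size=4)             # hidden state of the word y_j
h_retrieved = rng.normal(size=(6, 4))   # 6 tokens from retrieved sentence pairs
extracted = attention_extract(h_word, h_retrieved)
print(extracted.shape)  # (4,)
```

The output is a convex combination of the retrieved token states, weighted by their similarity to the evaluated word, which is the intuition behind using the word representation as the attention query.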
In some possible embodiments, the retrieval phase comprises in particular the following steps:
step 11: the database builds a search engine with Lucene; Lucene constructs the index with an FST structure, in which words with the same prefix share an index path; that is, given a parallel sentence pair, Lucene builds an inverted index for each word of the parallel sentence pair, covering both the source language and the target language;
step 12: assume the machine translation quality assessment sentence pair has source X = (x_1, ..., x_i, ..., x_m) and translation Y = (y_1, ..., y_j, ..., y_n); for the word to be evaluated y_j in the pair, the query sequence is:
MUST(y j )∧SHOULD(x 1 )∧...∧SHOULD(x m )∧SHOULD(y 1 )∧...∧SHOULD(y n );
step 13: for the word to be evaluated, rank all retrieved parallel sentence pairs with BM25 and retain the top-k results as the required parallel sentence pairs.
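The query of step 12 can be sketched as a string in Lucene-like boolean syntax (illustrative only; the real Lucene query API and operator spelling differ): the evaluated word y_j is a MUST clause, and the remaining words of the sentence pair are optional SHOULD clauses.

```python
def build_query(source_words, target_words, j):
    """MUST-match the evaluated word y_j; all words of the sentence
    pair become optional SHOULD clauses, mirroring the query sequence
    MUST(y_j) ∧ SHOULD(x_1) ∧ ... ∧ SHOULD(y_n)."""
    must = [f"MUST({target_words[j]})"]
    should = [f"SHOULD({w})" for w in source_words]
    should += [f"SHOULD({w})" for w in target_words]
    return " ∧ ".join(must + should)

# hypothetical sentence pair: source in Chinese, MT output in English
q = build_query(["水獭", "吃", "鱼"], ["The", "kettle", "eats", "fish"], 1)
print(q)
```

The MUST clause guarantees every hit contains the word being judged, while the SHOULD clauses bias ranking toward pairs whose context resembles the current sentence pair.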
In some possible embodiments, the machine translation quality evaluation phase specifically includes the following steps:
Step 44: encode the machine translation quality assessment sentence pair with XLMR to obtain h^MT = XLMR([X; Y]). For the word to be evaluated y_j, extract its hidden representation at the corresponding position in h^MT, denoted h_{y_j}. In addition, the parallel sentence pairs retrieved for the word to be evaluated y_j are R_1, ..., R_k; encode R_1 through R_k with XLMR in the same way and take the last-layer hidden state of each, obtaining h^{R_1}, ..., h^{R_k}.
Step 55: concatenate h^{R_1} through h^{R_k} to obtain h^R = [h^{R_1}; ...; h^{R_k}].
Step 66: use the hidden representation h_{y_j} of y_j described in step 44 to extract information from h^R via multi-head attention, obtaining h^{R-Extract} = MultiHead(h_{y_j}, h^R, h^R).
Step 77: fuse h_{y_j} and h^{R-Extract} through a gating mechanism to obtain h^{final}; input h^{final} into a multi-layer perceptron for classification and output OK/BAD, i.e., whether the word to be evaluated y_j is translated correctly or incorrectly.
In some possible embodiments, the gating mechanism is:
g = σ(W_g [h_{y_j}; h^{R-Extract}]),
h^{final} = (1 - g) ⊙ h_{y_j} + g ⊙ h^{R-Extract}.
In some possible embodiments, when no parallel sentence pair containing the currently evaluated word can be retrieved from the parallel corpus, h^{R-Extract} takes the value 0 and the gate value g output by the model is also close to 0, i.e., the model does not use the retrieval result; after the fused representation is obtained, h^{final} is input into the multi-layer perceptron for classification in the same manner as the baseline model.
In a second aspect of the invention, an electronic device for improving automatic evaluation of machine translation quality using retrieval is provided, the electronic device comprising a processor and a memory:
the memory is configured to store program code and transmit it to the processor;
the processor is configured to execute, according to the instructions in the program code, the above method for improving automatic evaluation of machine translation quality using word-level fine-grained bilingual retrieval.
In a third aspect of the present invention, a computer-readable storage medium is provided, configured to store program code for executing the above method for improving automatic evaluation of machine translation quality using word-level fine-grained bilingual retrieval.
The invention has the beneficial effects that:
1. according to the method for improving the automatic evaluation of the machine translation quality by utilizing the retrieval, the quality evaluation of the machine translation by using the word-level-based fine-grained bilingual retrieval assistance is firstly provided, and the method can directly and effectively utilize related parallel sentence pairs and simultaneously relieve the problem of sparse training data of the machine translation quality evaluation.
2. Compared with the existing scheme, the method for improving the automatic evaluation of the machine translation quality by utilizing retrieval has better interpretability, and the judgment of the model in the invention is directly from the sample obtained by retrieval, so that the reason for making relevant decisions by the model is better explained.
3. Compared with the prior art, the method for improving the automatic evaluation of the machine translation quality by utilizing the retrieval has better flexibility for the utilization mode of the parallel sentence pairs. In the invention, the model can be adapted to the machine translation quality evaluation of a new scene only by replacing the database used for retrieval, and the model does not need to be retrained like the existing method.
4. The invention utilizes the method for automatically evaluating the translation quality of a retrieval and promotion machine to more completely utilize the external parallel sentence pairs. In the present invention, all pairs of parallel sentences are stored in the database and can be retrieved. The defect that the end-to-end model possibly forgets training data in the training process is avoided, and the performance of the machine translation quality evaluation model is improved.
Drawings
FIG. 1 is a flowchart illustrating the steps of a retrieval phase in a method for improving automatic evaluation of machine translation quality by retrieval according to an embodiment of the present disclosure;
FIG. 2 is a flowchart illustrating steps in a machine translation quality evaluation phase of a method for improving automatic evaluation of machine translation quality by using search according to an embodiment of the present disclosure;
FIG. 3 is a schematic representation of XLMR in example 1 of the present application;
FIG. 4 is a schematic view of a model in example 1 of the present application;
FIG. 5 is a schematic view of a model in example 2 of the present application;
fig. 6 is a block diagram of an electronic device for automatic evaluation of machine translation quality using search enhancement in an embodiment of the present application.
In the figure: 50. an electronic device; 51. a processor; 52. a memory.
Detailed Description
The following detailed description of preferred embodiments of the invention, taken in conjunction with the accompanying drawings, will make the advantages and features of the invention easier for those skilled in the art to understand and will delineate its scope of protection more clearly.
Existing machine translation quality assessment models are deficient in interpretability, flexibility, and completeness of utilization of external parallel sentence pairs. To address these problems, the embodiments of the invention provide a method that uses retrieved parallel data to assist automatic evaluation of machine translation quality.
Combining the characteristics of the machine translation quality evaluation task, the invention proposes a novel word-level fine-grained bilingual retrieval scheme; and combining the characteristics of existing machine translation quality evaluation models, it proposes a simple and effective mechanism for integrating the retrieval results into word-level machine translation quality evaluation.
The work of the invention focuses mainly on word-level machine translation quality evaluation without reference translations. Specifically, given a source sentence X = (x_1, ..., x_i, ..., x_m) (where x_i denotes the i-th word in the source sentence and m the sentence length), and the translation given by the machine translation system Y = (y_1, ..., y_j, ..., y_n) (where y_j denotes the j-th word in the translation and n its length), a machine translation evaluation system is needed to judge, for each word to be evaluated y_j in translation Y, whether it is OK or BAD.
In a first aspect of this embodiment, a method for improving automatic assessment of machine translation quality by using search includes:
and (3) a retrieval stage: for a machine translation quality evaluation sentence pair, searching a related parallel sentence pair for a word to be evaluated in the machine translation quality evaluation sentence pair in a database; for all parallel sentence pairs, the database constructs an inverted index for each word in each parallel sentence pair, i.e., when searching for any word in a certain parallel sentence pair, the parallel sentence pair can be retrieved.
The retrieval phase comprises the steps of:
step 1, constructing a database by using parallel sentence pairs;
step 2, for the machine translation quality evaluation sentence pair, constructing a query sequence in the database for the word to be evaluated, and retrieving;
step 3, ranking the retrieved parallel sentence pairs for the word to be evaluated, and retaining the required parallel sentence pairs.
A machine translation quality evaluation stage: encode the retrieved parallel sentence pairs and fuse them into the machine translation quality evaluation model.
The machine translation quality evaluation phase comprises the following steps:
step 4: using a cross-language pre-training model, encode the machine translation quality evaluation sentence pair and the parallel sentence pairs retrieved for its word to be evaluated, obtaining the hidden representation of the word to be evaluated in the sentence pair and the hidden representations of the retrieved parallel sentence pairs;
step 5: concatenate the hidden representations obtained from the parallel sentence pairs retrieved for the word to be evaluated;
step 6: use the hidden representation of the word to be evaluated to extract information from the concatenated representation of step 5 via multi-head attention;
step 7: fuse the hidden representation of the word to be evaluated and the representation extracted in step 6 through a gating mechanism, and input the resulting final representation into a multi-layer perceptron for classification, obtaining whether the word to be evaluated is translated correctly.
The method for improving the automatic evaluation of the machine translation quality by utilizing retrieval specifically comprises the following steps:
the retrieval phase specifically comprises the following steps:
Step 11: the database builds a search engine with Lucene [Białecki, Andrzej, Robert Muir, and Grant Ingersoll. 2012. Apache Lucene 4. In SIGIR 2012 Workshop on Open Source Information Retrieval, p. 17]. Given a bilingual parallel sentence pair, Lucene constructs an inverted index for every word of the pair, covering both the source language and the target language, so that searching for any word in a parallel sentence pair quickly retrieves that pair. Lucene builds its index with an FST (Finite State Transducer) structure, in which words sharing the same prefix share an index path. For example, when the word "source" is queried, Lucene traverses the storage structure in the order s -> o -> u -> r -> c -> e and finally finds the parallel sentence pairs containing "source".
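The prefix-sharing behavior of the FST index can be illustrated with a minimal trie (a simplification for illustration: a real FST additionally shares suffixes and attaches outputs to transitions):

```python
def trie_insert(trie, word, pair_id):
    """Insert a word; letters with a common prefix share a path."""
    node = trie
    for ch in word:
        node = node.setdefault(ch, {})
    node.setdefault("$", set()).add(pair_id)  # "$" marks a complete word

def trie_lookup(trie, word):
    """Walk the shared path letter by letter, like s->o->u->r->c->e."""
    node = trie
    for ch in word:
        if ch not in node:
            return set()
        node = node[ch]
    return node.get("$", set())

trie = {}
trie_insert(trie, "source", 0)
trie_insert(trie, "sour", 1)  # shares the s->o->u->r path with "source"
print(trie_lookup(trie, "source"))  # {0}
```

Both words traverse the same first four nodes, so storage and lookup cost for common prefixes is paid only once, which is the point of the FST layout.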
Step 12: assuming that the machine translation quality assessment sentence pair is: x ═ X 1 ,...,x i ,...,x m ) The translation is: y ═ Y 1 ,...,y j ,...,y n ) Evaluating the word y to be evaluated in sentence pairs for machine translation quality j The query sequence is:
MUST(y j )∧SHOULD(x 1 )∧...∧SHOULD(x m )∧SHOULD(y 1 )∧...∧SHOULD(y n );
i.e., the retrieved result must contain y_j, and preferably also contains the remaining words of the sample.
Step 13: rank all finally retrieved parallel sentence pairs with BM25 [Spärck Jones, Karen, Steve Walker, and Stephen E. Robertson. 2000. A probabilistic model of information retrieval: development and comparative experiments. Information Processing & Management 36(6): 779-840], and retain the top-k results as the required parallel sentence pairs.
Specifically, BM25 computes the similarity between two sentences (in this embodiment, each bilingual sentence pair is concatenated into one sentence); it is an optimization of the classic TF-IDF method and likewise takes word frequency and inverse document frequency into account. The BM25 score is computed as:

Score(Q, d) = Σ_i W_i · R(q_i, d)

where Q is the machine translation quality assessment sentence pair, q_i is each word in that pair, and d is a bilingual parallel sentence pair in the database being searched. R(q_i, d) measures the (length-normalized) frequency of q_i in d:

R(q_i, d) = [f_i · (k_1 + 1) / (f_i + K)] · [qf_i · (k_2 + 1) / (qf_i + k_2)], with K = k_1 · (1 - b + b · dl / avg_dl)

W_i measures how rare q_i is across the whole database (inverse document frequency):

W_i = log( (N - n_i + 0.5) / (n_i + 0.5) )

In the above formulas, k_1, k_2, and b are predefined tuning factors, N denotes the total number of sentence pairs in the database, n_i the number of sentence pairs containing q_i, f_i the number of occurrences of q_i in d, qf_i the number of occurrences of q_i in Q, dl the length of d, and avg_dl the average sentence length in the database.
Such a retrieval scheme ensures that the retrieved parallel sentence pairs contain the information needed to translate y_j, while keeping their translation context as similar as possible to the context currently being evaluated, which helps the model make better use of the retrieved data.
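A minimal BM25 scorer matching the formulas above can be sketched as follows (parameter values are illustrative, and the IDF uses the common +1 smoothing to keep weights non-negative, a slight deviation from the bare formula):

```python
import math

def bm25_score(query, doc, corpus, k1=1.2, k2=100.0, b=0.75):
    """Score one concatenated sentence pair `doc` (a list of words)
    against `query`, following the classic BM25 form with IDF weights."""
    N = len(corpus)
    avg_dl = sum(len(d) for d in corpus) / N
    K = k1 * (1 - b + b * len(doc) / avg_dl)
    score = 0.0
    for q in set(query):
        n_q = sum(1 for d in corpus if q in d)            # pairs containing q
        if n_q == 0:
            continue
        w = math.log((N - n_q + 0.5) / (n_q + 0.5) + 1)   # smoothed IDF W_i
        f = doc.count(q)                                  # freq of q in doc
        qf = query.count(q)                               # freq of q in query
        score += w * (f * (k1 + 1) / (f + K)) * (qf * (k2 + 1) / (qf + k2))
    return score

corpus = [
    "the kettle buzzes on a stove".split(),
    "the otter eats fish".split(),
    "a cat sleeps".split(),
]
query = "the kettle on the stove".split()
scores = [bm25_score(query, d, corpus) for d in corpus]
best = max(range(len(corpus)), key=lambda i: scores[i])
print(best)  # 0
```

Ranking the corpus by this score and keeping the top-k entries reproduces the retention step of the retrieval stage.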
The machine translation quality evaluation stage specifically comprises the following steps:
Step 44: comprehensively consider the information contained in the pre-training model and the information contained in the retrieved parallel sentence pairs. For the word to be evaluated y_j in the machine translation quality assessment sentence pair, let the retrieved parallel sentence pairs be R_1, ..., R_k. Encode R_1 through R_k with XLMR in the same way, taking the last-layer hidden state of each, to obtain h^{R_1}, ..., h^{R_k}. In addition, the machine translation quality assessment sentence pair itself is also encoded by XLMR, yielding h^MT = XLMR([X; Y]).
in the most commonly used machine translation quality assessment model at present, source sentence X and translation Y are usually concatenated and input into a cross-language pre-training model XLMR [ Conneau, Alexis, Kartikandewal, Naman Goyal, Vishrav Chaudhary, Guillame Wenzek, Francisco Guzm n, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. "Unsupervised cross sections-linear representation learning at scale. (Unsupervised large-scale cross-language characterization learning)" associated with expression X, Y, and E, after obtaining the hidden state of the last layer, they are input into a multi-layer perception machine for two-classification, as shown in FIG. 3.
Step 55: will be provided with
Figure RE-GDA0003737184810000124
To
Figure RE-GDA0003737184810000125
Are spliced together to obtain
Figure RE-GDA0003737184810000126
Step 66: use the hidden representation h_{y_j} of the word to be evaluated y_j at its corresponding position in h^MT to extract effective information from h^R with the Multi-Head Attention proposed in [Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30], obtaining h^{R-Extract} = MultiHead(h_{y_j}, h^R, h^R).
Step 77: fuse h_yj and h_R-Extract through a gating mechanism to obtain h_final, input h_final into a multi-layer perceptron for classification, and output OK/BAD.

The gating mechanism is as follows:

h_final = g·h_R-Extract + (1 - g)·h_yj, g = MLP([h_R-Extract; h_yj]).

If no sentence pair containing the word currently being evaluated can be retrieved from the parallel corpus, then h_R-Extract is 0 and the gate value g output by the gating mechanism is also close to 0, i.e. the model does not use the retrieval result. After the fused representation is obtained, h_final is input into the multi-layer perceptron for classification in the same manner as the baseline model, as shown in fig. 4.
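The attention-based extraction and gated fusion of steps 66 and 77 can be sketched numerically as follows; the random projection matrices, sizes, and the sigmoid gate MLP are illustrative assumptions standing in for the learned parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
d, heads = 16, 4          # illustrative sizes
dk = d // heads

def multihead_attention(q, K, V):
    # One query vector attends over the rows of K/V, head by head.
    # Projection matrices are random stand-ins for learned weights.
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) for _ in range(4))
    q_, K_, V_ = q @ Wq, K @ Wk, V @ Wv
    out = []
    for h in range(heads):
        s = slice(h * dk, (h + 1) * dk)
        scores = K_[:, s] @ q_[s] / np.sqrt(dk)
        w = np.exp(scores - scores.max())
        w /= w.sum()                      # softmax over retrieved tokens
        out.append(w @ V_[:, s])
    return np.concatenate(out) @ Wo

# h_yj: state of the word to be evaluated; h_R: spliced states of the
# retrieved parallel sentence pairs (e.g. 2 pairs of 15 tokens each).
h_yj = rng.standard_normal(d)
h_R = rng.standard_normal((30, d))
h_extract = multihead_attention(h_yj, h_R, h_R)   # h_R-Extract

# Gate g = sigmoid(MLP([h_R-Extract; h_yj])), then convex combination;
# a single scaled linear layer stands in for the gate MLP here.
Wg = 0.01 * rng.standard_normal(2 * d)
g = 1.0 / (1.0 + np.exp(-np.concatenate([h_extract, h_yj]) @ Wg))
h_final = g * h_extract + (1 - g) * h_yj
print(h_final.shape)
```

If retrieval returns nothing, h_extract is the zero vector and the fused representation degenerates toward h_yj, mirroring the fallback behaviour described above.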
Through the above steps, the method can retrieve related parallel sentence pairs for the translation word to be evaluated at a fine granularity, and effectively integrate the information in those parallel sentence pairs into the evaluation of that word.
Example 1: the translation accuracy of The kettle is judged by assuming that a model needs to evaluate The machine translation quality sentence pair of The otter feeds menly on fish at present. First, the search is performed according to the flow shown in fig. 1. The following search Query is constructed: MUST (kettle) purse … purse (fish purse) purse (mainly) purse … purse (food) and search into a database using parallel sentence pairs. And setting and reserving 2 retrieval results, and obtaining a retrieval parallel sentence pair with the most advanced score according to the BM25 score as follows: the kettle buzzes in a stove. "and" The kettle emits hot air over The stove. "
Next, the retrieved information is incorporated into the machine translation quality evaluation according to the flow shown in fig. 2. As shown in fig. 4, the machine translation quality evaluation sentence pair and the parallel sentence pairs retrieved for the word to be evaluated are first encoded with the cross-language pre-training model XLM-R, respectively, to obtain the last-layer hidden representations output by the model. The hidden representation of the machine translation quality evaluation sentence pair to be evaluated is:
h_MT = [h_The, …, h_fish, h_kettle, …, h_food];
The hidden representations of the retrieved parallel sentence pairs are encoded as h_R1 and h_R2. Splicing the hidden representations of the retrieved parallel sentence pairs yields:

h_R = [h_R1; h_R2].
Then the hidden representation h_kettle of "kettle" in h_MT is used to extract information from the spliced representation h_R with multi-head attention, yielding:

h_R-Extract = MultiHead-Attention(h_kettle, h_R, h_R).
Then, the hidden representation of "kettle" and the hidden representation extracted from the retrieved information are fused through the gate to obtain the final representation:

h_final = g_kettle·h_R-Extract + (1 - g_kettle)·h_kettle, g_kettle = MLP([h_R-Extract; h_kettle]).
Finally, the representation h_final is input into the multi-layer perceptron for classification, and the translation accuracy of "kettle" is output.
In this example, through the information contained in the retrieved parallel sentence pairs, the model can learn that the target word "kettle" corresponds to the source word "kettle", so translating "otter" as "kettle" in the current translation is wrong, and the model outputs BAD.
Example 2: suppose that The model currently needs to judge The translation accuracy of The conqueror for The machine translation quality assessment sentence pair "The last conqueror The next with his sword to proceed. First, the search is performed according to the flow shown in fig. 1. The following search Query is constructed:
MUST(conqueror) ∧ SHOULD(The) ∧ … ∧ SHOULD(drawn) ∧ SHOULD(last) ∧ … ∧ SHOULD(proceed), and the database constructed from parallel sentence pairs is searched. With the number of retained retrieval results set to 2, the top-scoring retrieved sentence pairs according to the BM25 score are: "There is conquistador gold on the island." and "The entire population of the town was put to the sword by the conquistador."
Next, the retrieved information is incorporated into the machine translation quality evaluation with reference to the flow shown in fig. 2. Referring to fig. 5, the machine translation quality evaluation sentence pair and the parallel sentence pairs retrieved for the word to be evaluated are first encoded with the cross-language pre-training model XLM-R, respectively, to obtain the last-layer hidden representations output by the model.
The hidden representation of the machine translation quality evaluation sentence pair to be evaluated is:

h_MT = [h_The, …, h_drawn, h_last, …, h_conqueror, …, h_proceed];
The hidden representations of the retrieved parallel sentence pairs are encoded as h_R1 and h_R2. Splicing the hidden representations of the retrieved parallel sentence pairs yields:

h_R = [h_R1; h_R2].
Then the hidden representation h_conqueror of "conqueror" in h_MT is used to extract information from the spliced representation h_R with multi-head attention, yielding:

h_R-Extract = MultiHead-Attention(h_conqueror, h_R, h_R);
Then, the hidden representation of "conqueror" and the hidden representation extracted from the retrieved information are fused through the gate to obtain the final representation:

h_final = g_conqueror·h_R-Extract + (1 - g_conqueror)·h_conqueror, g_conqueror = MLP([h_R-Extract; h_conqueror]).
Finally, the representation h_final is input into the multi-layer perceptron for classification, and the translation accuracy of "conqueror" is output.
In this example, from the information contained in the retrieved parallel sentence pairs, the model can learn that "conqueror" is the correct translation corresponding to "conquistador". Thus, translating "conquistador" as "conqueror" in the current translation is correct, and the model finally outputs OK.
Referring to fig. 6, the present invention provides an electronic device for automatic evaluation of translation quality by using search enhancement, as shown in fig. 6, the electronic device 50 includes a processor 51 and a memory 52 coupled to the processor 51.
The memory 52 is used for storing program codes and transmitting the program codes to the processor 51;
the processor 51 is configured to perform the above-described method for improving the automatic evaluation of the machine translation quality by using the search according to the instructions in the program code.
The memory 52 stores program instructions for implementing the method for improving the automatic evaluation of machine translation quality using search of the above-described embodiment or the method for improving the automatic evaluation of machine translation quality using search of the above-described embodiment.
The processor 51 is operative to execute the program instructions stored in the memory 52 to perform the method for improving automatic evaluation of machine translation quality using search.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip having signal processing capabilities. The processor 51 may be:
A DSP (Digital Signal Processor), a processor built from large-scale or very-large-scale integrated circuit chips to complete a certain signal processing task, which developed gradually to meet the needs of high-speed real-time signal processing.
An ASIC (Application Specific Integrated Circuit), an integrated circuit designed and manufactured according to the requirements of a specific user and a specific electronic system.
An FPGA (Field Programmable Gate Array), a further development of programmable devices such as PAL (Programmable Array Logic) and GAL (Generic Array Logic). It is a semi-custom circuit in the field of Application Specific Integrated Circuits (ASIC) that both overcomes the shortcomings of fully custom circuits and overcomes the limitation on the number of gate circuits of earlier programmable devices.
A general purpose processor, which may be a microprocessor or the processor may be any conventional processor or the like.
Other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
The present embodiments also provide a computer-readable storage medium for storing program code for performing a method for improving automatic assessment of machine translation quality using search as described above.
The storage medium stores program instructions capable of implementing all the methods described above, wherein the program instructions may be stored in the storage medium in the form of a software product and include instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash disk, a mobile hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, or terminal devices such as a computer, a server, a mobile phone, or a tablet.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.

Claims (10)

1. A method for improving automatic evaluation of machine translation quality using retrieval, characterized by comprising the following steps:
a retrieval stage: for a machine translation quality evaluation sentence pair, retrieving related parallel sentence pairs in a database for the word to be evaluated in the machine translation quality evaluation sentence pair;
a machine translation quality evaluation stage: encoding the retrieved parallel sentence pairs and fusing them into the machine translation quality evaluation model.
2. The method for improving automatic assessment of machine translation quality using search of claim 1, wherein said search phase comprises the steps of:
step 1, constructing a database by using parallel sentence pairs;
step 2, for the machine translation quality evaluation sentence pair, constructing a query sequence in the database for the word to be evaluated in the machine translation quality evaluation sentence pair, and retrieving;
and 3, sequencing the retrieved parallel sentence pairs of the words to be evaluated in the machine translation quality evaluation sentence pairs, and reserving the required parallel sentence pairs.
3. The method of claim 1 or 2, wherein, for all parallel sentence pairs, the database constructs an inverted index for each word in each parallel sentence pair, i.e., when any word in a parallel sentence pair is searched, that parallel sentence pair can be retrieved.
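The inverted index described in this claim can be sketched with a plain Python dictionary standing in for Lucene's FST-backed index; the example pairs are invented for illustration:

```python
from collections import defaultdict

def build_inverted_index(pairs):
    # Map every word of every parallel sentence pair (source side and
    # target side alike) to the set of pair ids containing it, so a
    # search for any word retrieves the whole pair.
    index = defaultdict(set)
    for pid, (src, tgt) in enumerate(pairs):
        for word in src.split() + tgt.split():
            index[word].add(pid)
    return index

pairs = [
    ("The kettle buzzes on the stove", "水壶 在 炉子 上 嗡嗡 响"),
    ("The otter feeds mainly on fish", "水獭 主要 以 鱼 为 食"),
]
index = build_inverted_index(pairs)
print(sorted(index["The"]))   # both pairs contain "The"
```

Because both source-language and target-language words are indexed, the MUST clause on the translation word and the SHOULD clauses on the source words can all hit the same pair.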
4. The method for improving automatic assessment of machine translation quality using search of claim 1, wherein said machine translation quality assessment phase comprises the steps of:
step 4, encoding, with a cross-language pre-training model, the machine translation quality evaluation sentence pair and the parallel sentence pairs retrieved for the word to be evaluated, respectively, to obtain the hidden layer representation of the word to be evaluated in the machine translation quality evaluation sentence pair and the hidden layer representations of the retrieved parallel sentence pairs;
step 5, splicing hidden layer representations obtained by the retrieved parallel sentence pairs corresponding to the word to be evaluated;
step 6, using the hidden layer representation of the word to be evaluated in the machine translation quality evaluation sentence pair to extract information from the hidden layer representation spliced in step 5 with multi-head attention;
and 7, fusing the hidden layer representation of the word to be evaluated in the machine translation quality evaluation sentence and the hidden layer representation extracted based on the information in the step 6 through a gate control mechanism, inputting the obtained final representation into a multilayer perceptron for classification, and obtaining the translation accuracy of the word to be evaluated.
5. The method for improving automatic evaluation of machine translation quality using search of claim 1 or 2, wherein said retrieval stage specifically comprises the following steps:
step 11, the database builds a search engine through Lucene; Lucene constructs the index with an FST structure in which words sharing the same prefix share an index path, i.e., given a parallel sentence pair, Lucene constructs an inverted index for each word of the parallel sentence pair, covering both the source language and the target language;
step 12, assuming that the machine translation quality evaluation sentence pair has the source sentence X = (x_1, …, x_i, …, x_m) and the translation Y = (y_1, …, y_j, …, y_n), then for the word y_j to be evaluated in the machine translation quality evaluation sentence pair, the query sequence is:

MUST(y_j) ∧ SHOULD(x_1) ∧ … ∧ SHOULD(x_m) ∧ SHOULD(y_1) ∧ … ∧ SHOULD(y_n);
step 13, for the word to be evaluated, all retrieved parallel sentence pairs are ranked with BM25, and the top-k results are retained; the BM25 score is calculated as:

Score(Q, d) = Σ_i W_i · R(q_i, d)

wherein Q is the machine translation quality evaluation sentence pair to be evaluated, q_i represents each word in the machine translation quality evaluation sentence pair, d represents a parallel sentence pair in the retrieved database, and R(q_i, d) measures the relevance of q_i to d, calculated as:

R(q_i, d) = (f_i·(k_1 + 1))/(f_i + K) · (qf_i·(k_2 + 1))/(qf_i + k_2), with K = k_1·(1 - b + b·dl/avg_dl);

W_i weights q_i by how rare it is across the whole database, calculated as:

W_i = log((N - n_i + 0.5)/(n_i + 0.5));

in the above formulas, k_1, k_2 and b are predefined tuning factors, N denotes the total number of parallel sentence pairs in the database, n_i denotes the number of parallel sentence pairs containing q_i, f_i denotes the number of occurrences of q_i in d, qf_i denotes the number of occurrences of q_i in Q, dl denotes the length of d, and avg_dl denotes the average sentence length in the database.
6. The method for improving automatic assessment of machine translation quality using search of claim 5, wherein said machine translation quality assessment phase comprises the following steps:
step 44: encoding the machine translation quality evaluation sentence pair with XLMR to obtain:

h_MT = [h_x1, …, h_xm, h_y1, …, h_yn];

for the word y_j to be evaluated, extracting its hidden representation h_yj at the corresponding position in h_MT; in addition, for the word y_j to be evaluated, the corresponding retrieved parallel sentence pairs are:

R_1, …, R_k;

encoding R_1 to R_k with XLMR in the same manner and taking the last-layer hidden states respectively obtains:

h_R1, …, h_Rk;
step 55: splicing h_R1 to h_Rk together to obtain h_R = [h_R1; …; h_Rk];
and step 66: using the hidden representation h_yj of y_j described in step 44 to extract information from h_R through MultiHead Attention, obtaining h_R-Extract = MultiHead-Attention(h_yj, h_R, h_R);
step 77: fusing h_yj and h_R-Extract through a gating mechanism to obtain h_final, inputting h_final into a multi-layer perceptron for classification, and outputting OK/BAD, i.e. whether the translation of the word y_j to be evaluated is correct or incorrect.
7. The method for improving automatic evaluation of machine translation quality using search of claim 6, wherein said gating mechanism is:

h_final = g·h_R-Extract + (1 - g)·h_yj, g = MLP([h_R-Extract; h_yj]).
8. The method of claim 7, wherein if no sentence pair containing the word currently being evaluated is retrieved from the parallel sentence pairs, then h_R-Extract is 0 and the gate value g output by the gating mechanism is also close to 0, i.e. the model does not use the retrieval result; after the fused representation is obtained, h_final is input into the multi-layer perceptron for classification in the same manner as the baseline model.
9. Electronic device for automatic assessment of machine translation quality using retrieval enhancement, characterized in that said electronic device (50) comprises a processor (51) and a memory (52):
the memory (52) is used for storing program codes and transmitting the program codes to the processor (51);
the processor (51) is configured to perform the method of any of claims 1-8 according to instructions in the program code.
10. A computer-readable storage medium for storing program code for performing the method for improving automatic assessment of machine translation quality using search according to any of claims 1 to 8.
CN202210460184.4A 2022-04-28 2022-04-28 Method, medium, and apparatus for improving automatic evaluation of machine translation quality using search Pending CN114896992A (en)

Publication: CN114896992A, published 2022-08-12.
