CN113033218B - Machine translation quality evaluation method based on neural network structure search - Google Patents


Info

Publication number
CN113033218B
Authority
CN
China
Prior art keywords
predictor
model
machine translation
search
fitness
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110414498.6A
Other languages
Chinese (zh)
Other versions
CN113033218A (en)
Inventor
杜权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Yayi Network Technology Co ltd
Original Assignee
Shenyang Yayi Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Yayi Network Technology Co ltd filed Critical Shenyang Yayi Network Technology Co ltd
Priority to CN202110414498.6A priority Critical patent/CN113033218B/en
Publication of CN113033218A publication Critical patent/CN113033218A/en
Application granted granted Critical
Publication of CN113033218B publication Critical patent/CN113033218B/en


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/40 - Processing or translation of natural language
    • G06F40/51 - Translation evaluation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/40 - Processing or translation of natural language
    • G06F40/58 - Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/086 - Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

Abstract

The invention discloses a machine translation quality evaluation method based on neural network structure search, which comprises the following steps: acquiring training data of a WMT quality assessment task and training data of a WMT machine translation task; determining the predictor component of a predictor-evaluator model as the part to which structure search is applied, and choosing a search strategy based on an evolutionary algorithm for pre-searching; constructing a classical predictor-evaluator model, using a Transformer neural machine translation model to warm-start the initial population of the evolutionary search strategy; pre-searching with the evolutionary search strategy; fine-tuning, training, and optimizing the network structure of the predictor part; and performing the word-level quality assessment task with the complete model, using its accuracy on the test set to characterize model performance. The invention uses network structure search techniques to tailor the network structure of the predictor component to the quality assessment task and its data characteristics.

Description

Machine translation quality evaluation method based on neural network structure search
Technical Field
The invention relates to a machine translation quality evaluation technology, in particular to a machine translation quality evaluation method based on neural network structure search.
Background
In recent years, with the widespread adoption of deep learning techniques, neural-network-based approaches have achieved remarkable success in many fields. The performance of such approaches on a specific task often depends on the structure of the neural network, so most research effort has focused on designing better network structures. As research in each field progresses, ever better neural network structures are proposed and the structures applied to each task grow increasingly complex, which means the trial-and-error and time costs of designing network structures by hand are becoming harder to bear; structure search technology arose in response.
Structure search technology automatically obtains a neural network structure with better performance and stronger generalization by designing an economical and efficient search method over a given search space, aiming to relieve researchers of a large amount of mental labor. The mainstream approaches to structure search currently include gradient-based, evolutionary-algorithm-based, reinforcement-learning-based, and Bayesian-optimization-based methods.
Translation quality assessment is an important area in machine translation: it judges translation quality without relying on reference translations, covering tasks such as detecting word errors and scoring sentences or documents. The most classical structure for this task is the predictor-evaluator model, in which the predictor network responsible for feature extraction is often complex. Because quality-assessment data are scarce, researchers often use a trained translation model or a pre-trained model directly as the predictor. The evaluator's network structure is very simple, often just a bidirectional RNN.
Since it cannot be guaranteed that a translation model or pre-trained model is well suited to quality assessment, the present invention tailors the network structure of the predictor by means of neural network structure search. Current network structure search methods are mostly applied to lighter-weight tasks such as image classification and language modeling, because structure search places extreme demands on computing power, and lightweight searches are more feasible on existing equipment. For a task like quality assessment, where the neural network structure itself is relatively complex, applying structure search is correspondingly difficult.
Disclosure of Invention
Aiming at the situation that the network structure of the predictor component in the classical predictor-evaluator model is not fully suited to the quality evaluation task, the invention provides a machine translation quality evaluation method based on neural network structure search: the network structure of the predictor component is found with a network structure search technique, further improving model performance.
To this end, the technical scheme adopted by the invention is as follows:
The invention provides a machine translation quality evaluation method based on neural network structure search, which comprises the following steps:
1) Acquiring training data of a WMT quality assessment task and training data of a WMT machine translation task;
2) Determining the predictor component of a predictor-evaluator model as the part to which the network structure search technique is applied, determining a search space from the component's structure and functional characteristics, and determining that a search strategy based on an evolutionary algorithm will be used for pre-searching;
3) Constructing a classical predictor-evaluator model, in which the evaluator part directly uses the bidirectional GRU of the traditional model and the predictor part is constructed according to the search space and search strategy, with a Transformer neural machine translation model used to warm-start the initial population of the evolutionary search strategy;
4) Taking neural machine translation as the target task and machine translation bilingual data as training data, performing a pre-search with the evolutionary search strategy;
5) Fine-tuning the network structure of the predictor part with the data of the WMT quality assessment task;
6) Training and optimizing the searched predictor with machine translation bilingual training data until convergence, then continuing to train and optimize the overall predictor-evaluator model with the WMT quality evaluation data until convergence;
7) Performing the word-level quality assessment task with the fully trained model, using its accuracy on the test set to characterize model performance.
Step 2) selects the structural space near the Transformer model as the search space, modified on the basis of the NASNet search space. The search space consists of two groups of identical stackable computing units representing the encoder and the decoder respectively; the computing units of the different parts are cascades of different numbers of NASNet-style blocks, and each block comprises a left branch and a right branch, which respectively receive two hidden-state inputs and combine them into a new hidden state as the block's output. What the structure search actually explores is the combination of operations in the left and right branches, covering the input, normalization, layer structure, output dimension, activation function, combination function, and number of computing units. A search strategy based on an evolutionary algorithm then finds the predictor's network structure within this search space: all candidate structures are regarded as a population in the biological sense, each candidate structure is an individual, "survival of the fittest" during population evolution is the process of selecting candidate structures, and how "good" an individual is, is measured by its fitness.
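The block-based search space described above can be sketched as a small data structure. The field names and option lists below are illustrative assumptions for demonstration, not the patent's exact vocabulary of operations:

```python
import random

# Illustrative option lists for one branch of a NASNet-style block; the
# concrete choices (layer types, dimensions, activations) are assumptions.
BRANCH_OPTIONS = {
    "input":      ["hidden_0", "hidden_1"],               # which hidden state feeds the branch
    "norm":       ["layer_norm", "none"],
    "layer":      ["self_attention", "ffn", "conv", "identity"],
    "output_dim": [256, 512, 1024],
    "activation": ["relu", "gelu", "none"],
}
COMBINER_OPTIONS = ["add", "concat", "multiply"]          # how the two branches are merged

def random_block(rng: random.Random) -> dict:
    """Sample one block: a left branch, a right branch, and a combiner."""
    def branch():
        return {k: rng.choice(v) for k, v in BRANCH_OPTIONS.items()}
    return {"left": branch(), "right": branch(), "combiner": rng.choice(COMBINER_OPTIONS)}

def random_candidate(rng: random.Random, enc_blocks: int = 3, dec_blocks: int = 3) -> dict:
    """A candidate structure: two groups of stacked units (encoder and decoder)."""
    return {
        "encoder": [random_block(rng) for _ in range(enc_blocks)],
        "decoder": [random_block(rng) for _ in range(dec_blocks)],
    }

rng = random.Random(0)
candidate = random_candidate(rng)
```

Encoding each candidate as plain data like this is what lets the evolutionary strategy sample, compare, and mutate structures as "individuals" of a population.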
Step 3) builds the predictor-evaluator model, searches the internal structure of the predictor with the network structure search technique, and keeps the internal structure of the evaluator part as a classical bidirectional recurrent neural network, specifically a bidirectional GRU network. In the search over predictor structures, the population is initially warm-started from a Transformer neural machine translation model, on the basis of which predictor structures better than the existing Transformer model are sought.
In step 4), inspired by pre-training methods, the neural machine translation task is taken as the target task and the network structure of the predictor component is pre-searched while making full use of machine translation bilingual data. The evolutionary algorithm adopted in the pre-search is a progressive dynamic hurdles algorithm built on tournament-selection evolution, as follows:
401) Randomly sample N individuals from the original population (obtained by initializing from the Transformer model) as a sub-population, evaluate each individual's loss on the validation set as its fitness, and select the individual with the highest fitness to mutate, i.e., change some components of the network model into other components, generating m sub-models;
402) After training the m sub-models generated in 401) for s₀ steps, evaluate their fitness and compute the mean fitness h₀ of the whole population at that time;
403) Again randomly sample N individuals from the current population as a sub-population, evaluate each individual's fitness, and select the fittest individual to mutate, generating m sub-models; after training these m sub-models for s₀ steps, evaluate their fitness; those with fitness greater than h₀ continue training for s₁ steps, after which their fitness is evaluated again and the mean fitness h₁ of the whole population at that time is computed;
404) Again randomly sample N individuals from the current population as a sub-population, evaluate each individual's fitness, and select the fittest individual to mutate, generating m sub-models; after training these m sub-models for s₀ steps, evaluate their fitness; those with fitness greater than h₀ continue training for s₁ steps and are evaluated again; those with fitness greater than h₁ continue training for s₂ steps, after which their fitness is evaluated and the mean fitness h₂ of the whole population at that time is computed;
405) Continue in the same way until the training steps of all individuals in the population reach a specified value.
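Steps 401)-405) can be sketched as the following loop. This is a simplified sketch, not the patent's implementation: the population representation, the mutation operator, and the fitness function (supplied by the caller, with higher values taken as better) are all illustrative assumptions.

```python
import random

def mutate(genome, rng):
    """Toy mutation for the sketch: flip one element of a list-of-bits genome."""
    out = list(genome)
    out[rng.randrange(len(out))] ^= 1
    return out

def progressive_dynamic_hurdles(population, train_and_eval, rng,
                                N=5, m=3, steps=(100, 200, 400), max_rounds=3):
    """Tournament-selection evolution with progressive dynamic hurdles.

    population     : list of (genome, fitness) pairs; higher fitness is better
    train_and_eval : fn(genome, extra_steps) -> fitness after further training
    steps          : per-stage training budgets (s0, s1, s2, ...)
    """
    hurdles = []  # h0, h1, ... : mean population fitness after each round
    for _ in range(max_rounds):
        # 401) sample a sub-population, pick its fittest individual, mutate it m times
        sample = rng.sample(population, min(N, len(population)))
        parent = max(sample, key=lambda ind: ind[1])[0]
        children = [mutate(parent, rng) for _ in range(m)]
        # 402)-404) each child must clear every existing hurdle to earn more training
        for child in children:
            fitness = train_and_eval(child, steps[0])
            for stage, hurdle in enumerate(hurdles, start=1):
                if fitness <= hurdle:      # unpromising: stop training early
                    break
                fitness = train_and_eval(child, steps[stage])
            population.append((child, fitness))
        hurdles.append(sum(f for _, f in population) / len(population))
    return max(population, key=lambda ind: ind[1])
```

The early `break` is the resource-allocation mechanism the method relies on: a sub-model below the current hurdle stops training immediately, so its remaining budget effectively goes to better-performing sub-models.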
In step 5), the structure found in step 4) is used as the predictor, and the training data of the quality assessment task is used, with the evolutionary algorithm of step 4) applied to the whole predictor-evaluator model, to fine-tune the structure of the predictor.
In step 6), after the model parameters of the predictor and evaluator components are re-initialized, the predictor's parameters are trained on bilingual data from the machine translation task, with neural machine translation as the target task, until convergence; the model parameters of the predictor and evaluator are then trained and optimized together on the quality evaluation data until convergence. Specifically:
601) Build the predictor component with the network structure obtained after the fine-tuning of step 5);
602) After re-initializing the parameters of the predictor component, train them conventionally on bilingual data from the machine translation task, with neural machine translation as the target task, until the model's loss converges;
603) Build the evaluator component with a bidirectional GRU network and combine it with the pre-trained predictor component;
604) Re-initialize the parameters of the evaluator, then train the model parameters of the predictor and evaluator together on the quality assessment training data until the model's loss on the training set converges.
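Steps 601)-604) amount to a two-stage training schedule: pre-train the predictor on MT data, then train the combined model on quality-assessment data. The sketch below shows only the control flow; `ToyComponent` and its decaying loss are placeholder assumptions standing in for real neural modules:

```python
class ToyComponent:
    """Stand-in for the predictor / evaluator networks. In the real system
    these are neural modules; here the loss simply decays with training steps
    (an assumption so the control flow can run end to end)."""
    def __init__(self):
        self.steps = 0
    def init_params(self):
        self.steps = 0
    def step(self, _batch):
        self.steps += 1
        return 1.0 / self.steps  # toy monotonically decreasing loss

def train_until_converged(step_fn, patience=3, tol=1e-4, max_steps=10_000):
    """Run step_fn() (which returns a loss) until it stops improving by tol."""
    best, stale = float("inf"), 0
    for _ in range(max_steps):
        loss = step_fn()
        if loss < best - tol:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best

def two_stage_training(predictor, evaluator):
    """601)-604): pre-train the predictor on MT data, then train the full
    predictor-evaluator model on quality-assessment data."""
    predictor.init_params()                               # 602) re-initialize
    mt_loss = train_until_converged(lambda: predictor.step("mt_batch"))
    evaluator.init_params()                               # 604) re-initialize evaluator only
    qe_loss = train_until_converged(
        lambda: (predictor.step("qe_batch") + evaluator.step("qe_batch")) / 2)
    return mt_loss, qe_loss
```

Note that the second stage updates both components together, while only the evaluator's parameters are re-initialized before it, matching the ordering of 602) and 604).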
The invention has the following beneficial effects and advantages:
1. The method moves beyond the traditional practice of directly using a translation model or pre-trained model as the predictor, and uses network structure search to tailor a network structure for the predictor component to the task and data characteristics of quality assessment.
2. The invention overcomes the excessive training time of the network structure search process by adopting the progressive dynamic hurdles method, stopping the training of unpromising models early so that more computing resources are allocated to the sub-models that currently perform best.
3. The invention improves performance on the quality evaluation task through automatic design of the model structure.
Drawings
FIG. 1 is a diagram of a search space involved in a network structure search process in accordance with the present invention;
FIG. 2 is a schematic diagram of an overall model of a predictor-evaluator in accordance with the present invention;
fig. 3 is a schematic diagram of a network structure pre-search and fine tuning process according to the present invention.
Detailed Description
The invention is further elucidated below in connection with the drawings of the specification.
The invention relates to a machine translation quality evaluation method based on neural network structure search. It mainly searches the network structure of the predictor component of the classical predictor-evaluator model; during the search, the evaluator part uses a bidirectional recurrent network. Because quality assessment training data are scarce, neural machine translation is used as the target task: the network structure of the predictor component is pre-searched on bilingual machine translation data, and the pre-searched structure is then fine-tuned on the quality assessment training data. The entire search uses a search strategy based on an evolutionary algorithm.
The invention discloses a machine translation quality evaluation method based on neural network structure search, which comprises the following steps:
1) Acquiring the English-German training data of the WMT2020 quality assessment task and the English-German training data of the WMT2014 machine translation task;
2) Determining the predictor component of the predictor-evaluator model as the part to which the network structure search technique is applied, determining a search space from the component's structure and functional characteristics, and determining that a search strategy based on an evolutionary algorithm will be used for pre-searching;
3) Constructing a classical predictor-evaluator model, in which the evaluator part directly uses the bidirectional GRU of the traditional model and the predictor part is constructed according to the search space and search strategy, with a Transformer neural machine translation model used to warm-start the initial population of the evolutionary search strategy;
4) Taking neural machine translation as a target task, taking machine translation bilingual data as training data, and performing pre-search by using a search strategy based on an evolutionary algorithm;
5) Fine-tuning the network structure of the predictor part using the labeled quality evaluation data;
6) Training and optimizing the searched predictor with machine translation bilingual training data until convergence, then continuing to train and optimize the overall predictor-evaluator model with the quality evaluation training data until convergence;
7) Performing the word-level quality assessment task with the fully trained model, using its accuracy on the test set to characterize model performance.
In step 2), the structural space around the Transformer model is selected empirically as the search space; more specifically, a slight modification is made on the basis of the NASNet search space. As shown in fig. 1, the search space consists of two identical stackable computing units representing the encoder and the decoder respectively; the computing units of the different parts are cascades of different numbers of NASNet-style blocks, and each block comprises a left branch and a right branch, which respectively receive two hidden-state inputs and combine them into a new hidden state as the block's output. What the structure search actually explores is the combination of operations in the left and right branches, covering the input, normalization, layer structure, output dimension, activation function, combination function, and number of computing units. A search strategy based on an evolutionary algorithm then finds the predictor's network structure within this search space: all candidate structures are regarded as a population in the biological sense, each candidate structure is an individual, "survival of the fittest" during population evolution is the process of selecting candidate structures, and how "good" an individual is, is measured by its fitness (the candidate structure's loss on the validation set after it has been trained for a certain number of steps).
In step 3), the predictor-evaluator model is built; its framework is shown in fig. 2. Since the predictor responsible for feature extraction has the greatest influence on the performance of the whole model, the network structure search is focused on the internal structure of the predictor, while the internal structure of the evaluator part is kept as a classical bidirectional recurrent neural network, specifically a bidirectional GRU network. In the search over predictor structures, the population is initially warm-started from a Transformer neural machine translation model, on the basis of which predictor structures better than the existing Transformer model are sought.
In step 4), since the scarce quality evaluation data are insufficient to complete the network structure search, and inspired by pre-training methods, the neural machine translation task is taken as the target task and the network structure of the predictor component is pre-searched while making full use of machine translation bilingual data; this corresponds to the pre-search process in the left part of fig. 3. The evolutionary algorithm adopted in the pre-search dynamically allocates resources to the more promising network structures according to their fitness, on the basis of tournament-selection evolution; this is the progressive dynamic hurdles algorithm, which works as follows:
401) Randomly sample N individuals from the original population initialized from the Transformer model as a sub-population, evaluate each individual's loss on the validation set as its fitness, and select the individual with the highest fitness to mutate, i.e., change some components of the network model into other components, generating m sub-models.
402) After training the m sub-models generated in 401) for s₀ steps, evaluate their fitness and compute the mean fitness h₀ of the whole population at that time.
403) Again randomly sample N individuals from the current population as a sub-population, evaluate each individual's fitness, and select the fittest individual to mutate, generating m sub-models. After training these m sub-models for s₀ steps, evaluate their fitness; those with fitness greater than h₀ continue training for s₁ steps, after which their fitness is evaluated again and the mean fitness h₁ of the whole population at that time is computed.
404) Again randomly sample N individuals from the current population as a sub-population, evaluate each individual's fitness, and select the fittest individual to mutate, generating m sub-models. After training these m sub-models for s₀ steps, evaluate their fitness; those with fitness greater than h₀ continue training for s₁ steps and are evaluated again; those with fitness greater than h₁ continue training for s₂ steps, after which their fitness is evaluated and the mean fitness h₂ of the whole population at that time is computed.
405) Continue in the same way until the training steps of all individuals in the population reach a specified value.
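The mutation used in step 401), changing one component of a candidate structure into another, can be sketched over a dictionary-encoded branch. The field names and option lists here are illustrative assumptions, not the patent's exact operation set:

```python
import random

# Illustrative per-field options for one branch of a block (assumption).
OPTIONS = {
    "norm":       ["layer_norm", "none"],
    "layer":      ["self_attention", "ffn", "conv", "identity"],
    "output_dim": [256, 512, 1024],
    "activation": ["relu", "gelu", "none"],
}

def mutate_branch(branch: dict, rng: random.Random) -> dict:
    """Return a copy of `branch` with exactly one field changed to a
    different allowed value (the 'change one component' mutation)."""
    field = rng.choice(list(OPTIONS))
    alternatives = [v for v in OPTIONS[field] if v != branch[field]]
    return {**branch, field: rng.choice(alternatives)}

rng = random.Random(0)
parent = {"norm": "layer_norm", "layer": "self_attention",
          "output_dim": 512, "activation": "relu"}
child = mutate_branch(parent, rng)
```

Keeping the mutation to a single component per child keeps offspring close to a fit parent, which is what makes the warm-started, tournament-selected population converge instead of degenerating into random search.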
In step 5), the structure found in step 4) is used as the predictor, and the evolutionary algorithm of step 4) continues to be applied to the whole predictor-evaluator model, using the training data of the quality evaluation task, to fine-tune the structure of the predictor; this corresponds to the fine-tuning process in the right part of fig. 3.
In step 6), after the model parameters of the predictor and evaluator components are re-initialized, the predictor's parameters are trained on bilingual data from the machine translation task, with neural machine translation as the target task, until convergence; the model parameters of the predictor and evaluator are then trained and optimized together on the quality assessment data until convergence. Specifically:
601) Build the predictor component with the network structure obtained after the fine-tuning of step 5);
602) After re-initializing the parameters of the predictor component, train them conventionally on bilingual data from the machine translation task, with neural machine translation as the target task, until the model's loss converges;
603) Build the evaluator component with a bidirectional GRU network and combine it with the pre-trained predictor component;
604) Re-initialize the parameters of the evaluator, then train the model parameters of the predictor and evaluator together on the quality assessment training data until the model's loss on the training set converges.
The word-level quality assessment task is presented here as an example. Suppose the test data is {"Draw or select a line", "zeichen oder … Sie tein link aus"}, where "Draw or select a line" is the English source-language sentence and "zeichen oder … Sie tein link aus" is the German translation provided by the translation system. The word-level quality assessment task must identify which words in the source-language sentence caused translation errors, which words in the translation are mistranslated, and where words were omitted in translating the sentence. The two sentences in the test data are fed to the source-language end and the translation end of the model in fig. 2, respectively. The predictor component extracts highly abstract quality features from the source language and the translation, features that reflect the relation between them and the quality of the translation, and passes these features to the evaluator, which makes predictions from them. The output of this part comprises three kinds of tags: Source tags (reflecting whether each word in the source-language sentence is correctly translated), MT tags (reflecting whether each translated word in the translation is correct), and Gap tags (reflecting whether a word is omitted at the corresponding position in the translation). In this example, the Source tags are "BAD BAD OK BAD BAD OK", the MT tags are "OK OK OK OK OK BAD OK OK", and the Gap tags are "OK BAD OK OK OK OK OK OK OK".
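Given gold and predicted tag sequences like those in the example above, the word-level accuracy used in step 7) is simply the fraction of matching OK/BAD tags. A minimal sketch, with a hypothetical model output for illustration:

```python
def tag_accuracy(gold: list, pred: list) -> float:
    """Word-level QE accuracy: fraction of positions where the predicted
    OK/BAD tag matches the gold tag."""
    if len(gold) != len(pred):
        raise ValueError("tag sequences must be the same length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

gold = "OK OK OK OK OK BAD OK OK".split()   # MT tags from the example
pred = "OK OK BAD OK OK BAD OK OK".split()  # a hypothetical model output
acc = tag_accuracy(gold, pred)              # 7 of 8 tags match: 0.875
```

In practice this accuracy would be computed separately over the Source, MT, and Gap tag sequences of every sentence pair in the test set and then aggregated to characterize model performance.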

Claims (6)

1. A machine translation quality evaluation method based on neural network structure search is characterized by comprising the following steps:
1) Acquiring training data of a WMT quality assessment task and training data of a WMT machine translation task;
2) Determining the predictor component of a predictor-evaluator model as the part to which the network structure search technique is applied, determining a search space from the component's structure and functional characteristics, and determining that a search strategy based on an evolutionary algorithm will be used for pre-searching;
3) Constructing a classical predictor-evaluator model, in which the evaluator part directly uses the bidirectional GRU of the traditional model and the predictor part is constructed according to the search space and search strategy, with a Transformer neural machine translation model used to warm-start the initial population of the evolutionary search strategy;
4) Taking neural machine translation as the target task and machine translation bilingual data as training data, pre-searching the network structure of the predictor part with the evolutionary search strategy;
5) Fine-tuning the network structure of the predictor part with the data of the WMT quality assessment task;
6) Training and optimizing the searched predictor with machine translation bilingual training data until convergence, then continuing to train and optimize the overall predictor-evaluator model with the WMT quality evaluation data until convergence;
7) Performing the word-level quality assessment task with the fully trained model, using its accuracy on the test set to characterize model performance.
2. The machine translation quality evaluation method based on neural network structure search according to claim 1, wherein: step 2) selects the structural space near the Transformer model as the search space, modified on the basis of the NASNet search space; the search space consists of two groups of identical stackable computing units representing the encoder and the decoder respectively; the computing units of the different parts are cascades of different numbers of NASNet-style blocks, and each block comprises a left branch and a right branch, which respectively receive two hidden-state inputs and combine them into a new hidden state as the block's output; what the structure search actually explores is the combination of operations in the left and right branches, covering the input, normalization, layer structure, output dimension, activation function, combination function, and number of computing units; a search strategy based on an evolutionary algorithm then finds the predictor's network structure within this search space, that is, all candidate structures are regarded as a population in the biological sense, each candidate structure is an individual, "survival of the fittest" during population evolution is the process of selecting candidate structures, and how "good" an individual is, is measured by its fitness.
3. The machine translation quality evaluation method based on neural network structure search according to claim 1, wherein: step 3) builds the predictor-evaluator model, searches the internal structure of the predictor with the network structure search technique, and keeps the internal structure of the evaluator part as a classical bidirectional recurrent neural network, specifically a bidirectional GRU network; in the search over predictor structures, the population is initially warm-started from a Transformer neural machine translation model, on the basis of which predictor structures better than the existing Transformer model are sought.
4. The machine translation quality evaluation method based on neural network structure search according to claim 1, wherein: in step 4), inspired by the pre-training method, the neural machine translation task is taken as the target task and the network structure of the predictor component is pre-searched, making full use of machine translation bilingual data; the evolutionary algorithm adopted in the pre-search is a progressive dynamic hurdles algorithm built on a tournament-selection evolutionary algorithm, specifically:
401) Randomly sample N individuals from the original population obtained by initialization from the Transformer model to serve as a sub-population, evaluate each individual's loss on the check set as its fitness, and select the individual with the highest fitness for mutation, i.e. changing some components of the network model into other components, generating m child models;
402) Train the m child models generated in 401) for s0 steps, then evaluate the fitness of the m child models and compute the mean fitness h0 of the whole population at this point;
403) Again randomly sample N individuals from the current population as a sub-population, evaluate the fitness of each individual, and select the individual with the highest fitness for mutation, generating m child models; train these m child models for s0 steps and evaluate their fitness; those child models whose fitness exceeds h0 continue training for s1 further steps, after which their fitness is evaluated again and the mean fitness h1 of the whole population at this point is computed;
404) Again randomly sample N individuals from the current population as a sub-population, evaluate the fitness of each individual, and select the individual with the highest fitness for mutation, generating m child models; train these m child models for s0 steps and evaluate their fitness; those whose fitness exceeds h0 continue training for s1 further steps and are evaluated again; those whose fitness then exceeds h1 continue training for s2 further steps, after which their fitness is evaluated and the mean fitness h2 of the whole population at this point is computed;
405) And so on, until the number of training steps of every individual in the population reaches the specified value.
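A runnable toy version of steps 401)-405) is sketched below; the training dynamics, fitness function, and the concrete step budgets s0, s1, s2 are illustrative assumptions, with only the hurdle logic taken from the claim:

```python
import random

def evolve_with_hurdles(population, train, fitness, step_budgets,
                        n_sample, m_children, mutate, rng):
    """Progressive dynamic hurdles: a child earns extra training steps only by
    beating the mean population fitness recorded at each earlier hurdle."""
    hurdles = []  # h0, h1, ... : mean population fitness after each round
    for _round in range(len(step_budgets)):
        # Tournament selection: mutate the fittest of a random sub-population.
        sub = rng.sample(population, n_sample)
        parent = max(sub, key=fitness)
        children = [mutate(parent, rng) for _ in range(m_children)]
        for child in children:
            train(child, step_budgets[0])          # every child gets s0 steps
            # Keep training only while the child clears each existing hurdle.
            for h, s in zip(hurdles, step_budgets[1:]):
                if fitness(child) <= h:
                    break
                train(child, s)
        population.extend(children)
        hurdles.append(sum(fitness(ind) for ind in population) / len(population))
    return population, hurdles

rng = random.Random(1)

def make_ind():
    return {"quality": rng.random(), "steps": 0}

def train(ind, steps):
    """Toy stand-in for gradient training: more steps, higher fitness."""
    ind["steps"] += steps
    ind["quality"] += 0.01 * steps

fitness = lambda ind: ind["quality"]
mutate = lambda parent, r: {"quality": parent["quality"] + r.uniform(-0.1, 0.1),
                            "steps": 0}

population = [make_ind() for _ in range(8)]
population, hurdles = evolve_with_hurdles(
    population, train, fitness, step_budgets=[10, 20, 40],
    n_sample=4, m_children=2, mutate=mutate, rng=rng)
```

The design intent is that weak children are discarded cheaply after s0 steps, while promising ones receive the larger budgets s1 and s2 before the next hurdle is set.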
5. The machine translation quality evaluation method based on neural network structure search according to claim 1, wherein: in step 5), the structure searched in step 4) serves as the predictor, and the training data of the quality evaluation task is used to fine-tune the predictor structure by applying the evolutionary algorithm of step 4) to the whole predictor-evaluator model.
6. The machine translation quality evaluation method based on neural network structure search according to claim 1, wherein: in step 6), after the model parameters of the predictor and evaluator components are re-initialized, the model parameters of the predictor are first trained to convergence on bilingual data from the machine translation task, with neural machine translation as the target task; the model parameters of the predictor and the evaluator are then trained and optimized jointly on the quality evaluation data until convergence, specifically:
601) Build the predictor component with the network structure obtained after the fine-tuning in step 5);
602) After re-initializing the parameters of the predictor component, perform conventional training of the predictor parameters on bilingual data from the machine translation task, with neural machine translation as the target task, until the model loss converges;
603) Build the evaluator component with a bidirectional GRU network and combine it with the pre-trained predictor component;
604) Re-initialize the parameters of the evaluator, then train the model parameters of the predictor and the evaluator simultaneously on the quality evaluation training data until the model loss on the training set converges.
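The two-stage schedule of 601)-604) can be sketched as a training skeleton; the component classes, task labels, and fixed-epoch "convergence" below are illustrative stand-ins, not the patent's implementation:

```python
class Component:
    """Stand-in for a neural network component with re-initializable parameters."""
    def __init__(self, name):
        self.name = name
        self.initialized = False
        self.training_log = []   # records (task, data, epoch) updates

    def init_params(self):
        self.initialized = True

def train_until_converged(components, data, task, max_epochs=3):
    """Toy loop: record which components were updated on which data/task.
    A fixed epoch count stands in for 'until the loss converges'."""
    for epoch in range(max_epochs):
        for c in components:
            c.training_log.append((task, data, epoch))

# 601) Build the predictor from the fine-tuned searched structure.
predictor = Component("searched_predictor")
# 602) Re-initialize and pre-train the predictor alone on bilingual MT data.
predictor.init_params()
train_until_converged([predictor], data="bilingual", task="machine_translation")
# 603) Build the bidirectional-GRU evaluator on top of the predictor.
evaluator = Component("bigru_evaluator")
# 604) Re-initialize the evaluator, then train both jointly on QE data.
evaluator.init_params()
train_until_converged([predictor, evaluator], data="quality_estimation", task="qe")
```

Note the asymmetry the claim specifies: the predictor sees both the machine translation pre-training stage and the joint quality-estimation stage, while the evaluator is only ever updated in the joint stage.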
CN202110414498.6A 2021-04-16 2021-04-16 Machine translation quality evaluation method based on neural network structure search Active CN113033218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110414498.6A CN113033218B (en) 2021-04-16 2021-04-16 Machine translation quality evaluation method based on neural network structure search


Publications (2)

Publication Number Publication Date
CN113033218A CN113033218A (en) 2021-06-25
CN113033218B true CN113033218B (en) 2023-08-15

Family

ID=76457394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110414498.6A Active CN113033218B (en) 2021-04-16 2021-04-16 Machine translation quality evaluation method based on neural network structure search

Country Status (1)

Country Link
CN (1) CN113033218B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113515960B (en) * 2021-07-14 2024-04-02 厦门大学 Automatic translation quality assessment method integrating syntax information

Citations (4)

Publication number Priority date Publication date Assignee Title
CN108829684A * 2018-05-07 2018-11-16 内蒙古工业大学 A Mongolian-Chinese neural machine translation method based on a transfer learning strategy
CN109074242A * 2016-05-06 2018-12-21 电子湾有限公司 Using meta-information in neural machine translation
CN110598224A (en) * 2019-09-23 2019-12-20 腾讯科技(深圳)有限公司 Translation model training method, text processing device and storage medium
CN111078833A (en) * 2019-12-03 2020-04-28 哈尔滨工程大学 Text classification method based on neural network

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US11138392B2 (en) * 2018-07-26 2021-10-05 Google Llc Machine translation using neural network models


Non-Patent Citations (1)

Title
Translation quality estimation method based on multilingual pre-trained language models; Lu Jinliang; Zhang Jiajun; Journal of Xiamen University (Natural Science Edition), No. 2; full text *

Also Published As

Publication number Publication date
CN113033218A (en) 2021-06-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant