WO2024032691A1 - Machine translation quality assessment method and apparatus, device, and storage medium - Google Patents

Machine translation quality assessment method and apparatus, device, and storage medium

Info

Publication number
WO2024032691A1
WO2024032691A1 PCT/CN2023/112135 CN2023112135W WO2024032691A1 WO 2024032691 A1 WO2024032691 A1 WO 2024032691A1 CN 2023112135 W CN2023112135 W CN 2023112135W WO 2024032691 A1 WO2024032691 A1 WO 2024032691A1
Authority
WO
WIPO (PCT)
Prior art keywords
evaluation
language
target
source
text
Prior art date
Application number
PCT/CN2023/112135
Other languages
French (fr)
Chinese (zh)
Inventor
陶大程
丁亮
陆清屿
Original Assignee
京东科技信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东科技信息技术有限公司
Publication of WO2024032691A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/51 Translation evaluation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Definitions

  • The embodiments of this application relate to the technical field of natural language processing, for example, to a machine translation quality assessment method, device, equipment and storage medium.
  • The quality of translated text can be assessed with a translation quality assessment model. For example, an evaluation model trained on sentence-level annotation data tends to produce results that characterize the overall fluency of the translated text, whereas an evaluation model trained on word-level annotation data tends to produce results that characterize its fidelity.
  • Each translation quality evaluation model is therefore biased toward a single aspect of the translated text, such as overall fluency or fidelity, and cannot evaluate translation quality comprehensively. In addition, the translated texts of different language pairs are all evaluated with the same method, which leads to large differences in the accuracy of translation evaluation across language pairs.
  • Embodiments of the present application provide a machine translation quality assessment method, device, equipment and storage medium to comprehensively assess translation quality and ensure the accuracy of translation assessment for different language pairs.
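  • Embodiments of the present application provide a machine translation quality assessment method, including:
  • acquiring a translation text pair to be evaluated, where the translation text pair includes a source text corresponding to the source language and a translated target text corresponding to the target language;
  • performing quality evaluation on the target text based on at least two quality evaluation indicators and the source text, and determining the evaluation result corresponding to each quality evaluation indicator;
  • determining the evaluation weight corresponding to each quality evaluation indicator based on the language similarity between the source language and the target language;
  • fusing the at least two evaluation results based on the at least two evaluation weights to determine the target evaluation result of the translation text pair.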
  • embodiments of the present application also provide a machine translation quality assessment device, including:
  • the translation text pair acquisition module is configured to acquire a translation text pair to be evaluated, where the translation text pair includes a source text corresponding to the source language and a translated target text corresponding to the target language;
  • An evaluation result determination module is configured to perform a quality evaluation on the target text based on at least two quality evaluation indicators and the source text, and determine the evaluation results corresponding to each of the quality evaluation indicators;
  • An evaluation weight determination module is configured to determine the evaluation weight corresponding to each of the quality evaluation indicators based on the language similarity between the source language and the target language;
  • the evaluation result fusion module is configured to perform fusion processing on at least two evaluation results based on at least two evaluation weights, and determine the target evaluation result of the translated text pair.
  • Embodiments of the present application further provide an electronic device, where the electronic device includes:
  • at least one processor; and
  • a memory configured to store at least one program;
  • when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the machine translation quality assessment method provided in any embodiment of the present application.
  • Embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the machine translation quality assessment method provided in any embodiment of the present application is implemented.
  • Figure 1 is a flow chart of a machine translation quality assessment method provided by an embodiment of the present application
  • Figure 2 is a flow chart of another machine translation quality assessment method provided by an embodiment of the present application.
  • Figure 3 is a schematic structural diagram of a machine translation quality assessment device provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • Figure 1 is a flow chart of a machine translation quality assessment method provided by an embodiment of the present application. This embodiment can be applied to the situation of evaluating the quality of text translated by a machine translation model.
  • the method can be performed by a machine translation quality assessment device, which can be implemented in software and/or hardware and integrated into an electronic device. As shown in Figure 1, the method includes the following steps:
  • Acquire a translation text pair to be evaluated, where the translation text pair includes a source text corresponding to the source language and a translated target text corresponding to the target language.
  • The source language refers to the language being translated from.
  • The target language refers to the language being translated into.
  • The source text refers to the original text expressed in the source language, that is, the sentence to be translated.
  • The target text refers to the translation that expresses the same meaning as the source text in the target language, that is, the translated sentence.
  • For example, the source text can be input into a machine translation model, and the target text output by the model can be obtained, thereby yielding the translation text pair to be evaluated.
  • the quality evaluation index may be an index used to evaluate the translation quality of the target text.
  • Different types of quality assessment indicators tend to evaluate different levels of translation.
  • The quality evaluation index may be, but is not limited to, a fluency evaluation index, which is biased toward the overall fluency of the target text, or a fidelity evaluation index, which is biased toward the fidelity of the target text.
  • The fluency evaluation index characterizes the overall fluency of the translated text and whether it conforms to customary expression.
  • The fidelity evaluation index characterizes whether the details in the translated text faithfully reflect the meaning of the original text, that is, it evaluates detail-level problems such as mistranslations, omissions, and emotional errors in the translation.
  • Each quality evaluation index can correspond to at least one quality evaluation model, so that the evaluation result corresponding to the quality evaluation index is determined using at least one quality evaluation model.
  • This embodiment can use a scoring method to indicate the evaluation results. For example, the larger the score in the evaluation result, the higher the quality level corresponding to the quality evaluation indicator, for example, the higher the fluency or the higher the fidelity.
  • At least two different quality assessment indicators can be selected based on business needs. For each quality assessment indicator, the quality of the target text can be assessed based on at least one quality assessment model corresponding to that indicator, and the evaluation result corresponding to that indicator can be determined from the model output. If a quality assessment indicator corresponds to multiple quality assessment models, the evaluation results output by those models can be averaged, and the average is used as the evaluation result corresponding to that indicator, to improve the accuracy of quality evaluation.
  • language similarity can refer to the linguistic similarity between the source language and the target language in terms of language family, vocabulary and grammatical structure.
  • The language similarity between every two languages can be determined in advance, so that the language similarity between the source language and the target language can be obtained directly; alternatively, it can be determined in real time.
  • Based on the language similarity, the optimal evaluation weights corresponding to the different quality evaluation indicators can be determined. That is, different evaluation weights can be determined for different language pairs, so that the differences between language pairs are taken into account. This avoids large differences in the accuracy of translation evaluation across language pairs, thereby ensuring the accuracy of translation evaluation for different language pairs and improving the generality of translation quality evaluation.
  • For example, the evaluation result corresponding to each quality evaluation index can be multiplied by its evaluation weight, and the products can be summed; the resulting weighted average is used as the target evaluation result. Fusing the evaluation results of at least two quality evaluation indicators in this way evaluates the translation quality comprehensively and avoids the bias that a single evaluation indicator introduces, thereby improving the accuracy and robustness of the quality assessment.
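  • The weighted fusion described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `fuse_scores`, the indicator names, and the numeric scores and weights are all hypothetical, and the weights are assumed to sum to 1 as stated elsewhere in the description.

```python
def fuse_scores(results, weights):
    """Return the weighted sum of per-indicator scores.

    `results` and `weights` are dicts keyed by indicator name.
    The weights are assumed to sum to 1 (hypothetical validation).
    """
    if results.keys() != weights.keys():
        raise ValueError("results and weights must cover the same indicators")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("evaluation weights are expected to sum to 1")
    # Multiply each evaluation result by its weight and add the products.
    return sum(results[name] * weights[name] for name in results)


# Hypothetical scores and weights for one translation text pair.
results = {"fluency": 0.82, "fidelity": 0.64}
weights = {"fluency": 0.3, "fidelity": 0.7}
target_result = fuse_scores(results, weights)
```

  • Keying both inputs by indicator name keeps the pairing of result and weight explicit when more than two indicators are fused.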
  • In the technical solution of this embodiment, quality assessment is performed on the target text based on at least two quality assessment indicators and the source text in the translation text pair to be assessed, and the assessment result corresponding to each quality assessment indicator is determined. The evaluation weight corresponding to each quality evaluation index is determined based on the language similarity between the source language and the target language, and the at least two evaluation results are fused based on the at least two evaluation weights to determine the target evaluation result of the translation text pair. Fusing the evaluation results of at least two different quality evaluation indicators evaluates translation quality comprehensively and avoids biased results; determining each evaluation weight from the language similarity takes the differences between language pairs into account and avoids large differences in translation evaluation accuracy across language pairs, thereby ensuring the accuracy of translation evaluation for different language pairs.
  • S130 may include: inputting the language similarity between the source language and the target language into a preset network model, where the preset network model is obtained by pre-training based on translation sample pair data and label evaluation results; and determining, based on the output of the preset network model, the evaluation weight corresponding to each quality evaluation indicator.
  • the preset network model can be set to represent the mapping relationship between the optimal evaluation weight corresponding to each quality evaluation index and the language similarity.
  • This mapping relationship can be learned from translation sample pair data and label evaluation results. For example, quality assessment can be performed on the target sample text in each item of translation sample pair data based on at least two quality assessment indicators and the corresponding source sample text, yielding at least two sample evaluation results for each item of translation sample pair data.
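  • The description does not specify how the mapping is learned from the sample evaluation results and the label evaluation results. Purely as a hypothetical illustration, one way to derive a target weight for a single language pair is a least-squares fit of one weight w so that w * fluency + (1 - w) * fidelity approximates the labels. The function name `best_fluency_weight`, the two-indicator setup, and the sample numbers are assumptions, not part of the source.

```python
def best_fluency_weight(fluency, fidelity, labels):
    """Least-squares weight w in [0, 1] for w*f + (1-w)*d ~ label.

    Hypothetical helper: the error for one sample is
    w*(f - d) + (d - y), so the unconstrained optimum is
    sum((f - d)*(y - d)) / sum((f - d)**2).
    """
    num = sum((f - d) * (y - d) for f, d, y in zip(fluency, fidelity, labels))
    den = sum((f - d) ** 2 for f, d in zip(fluency, fidelity))
    if den == 0:
        return 0.5  # both indicators agree on every sample; any weight fits
    # The objective is convex in w, so clamping yields the optimum on [0, 1].
    return min(1.0, max(0.0, num / den))


# Hypothetical sample evaluation results and label evaluation results.
fluency_scores = [0.9, 0.5, 0.7]
fidelity_scores = [0.5, 0.9, 0.3]
label_scores = [0.6, 0.8, 0.4]
w_fluency = best_fluency_weight(fluency_scores, fidelity_scores, label_scores)
```

  • Under this assumption, pairs of (language similarity, fitted weight) collected over many language pairs could serve as training data for a model that maps similarity to weights; the source leaves the actual training procedure open.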
  • the network architecture of the preset network model can be set based on business requirements.
  • The preset network model can directly output the evaluation weight corresponding to each quality evaluation indicator, or it can output the evaluation weight corresponding to only one quality evaluation indicator, from which the evaluation weights of the other indicators are determined. For example, if there are two quality evaluation indicators A and B, and the preset network model outputs the evaluation weight corresponding to indicator A, then, since the evaluation weights of A and B sum to 1, the difference between 1 and the evaluation weight of indicator A is determined as the evaluation weight of indicator B.
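  • The two-indicator case above amounts to a one-line computation. The sketch below assumes the model output is already a weight in [0, 1]; the function name `complete_weights` and the range check are hypothetical additions.

```python
def complete_weights(weight_a):
    """Derive the weight of indicator B from the model's output for A.

    Assumes exactly two indicators whose weights sum to 1.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("the output weight is expected to lie in [0, 1]")
    return {"A": weight_a, "B": 1.0 - weight_a}


# Hypothetical model output for indicator A.
pair_weights = complete_weights(0.3)
```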
  • Before the evaluation weights are determined, the method may also include: based on a preset multilingual model, determining the source language representation vector corresponding to the source language and the target language representation vector corresponding to the target language according to the source corpus corresponding to the source language and the target corpus corresponding to the target language; and determining the language similarity between the source language and the target language based on the source language representation vector and the target language representation vector.
  • the preset multilingual model may be a model that performs language processing on texts in different languages.
  • the preset multilingual model may be but is not limited to the XLM-RoBERTa model.
  • The source language representation vector $v_i$, which characterizes the linguistic properties of the source language, can be determined based on the preset multilingual model and the source corpus.
  • The target language representation vector $v_j$, which characterizes the linguistic properties of the target language, can be determined based on the preset multilingual model and the target corpus.
  • The cosine similarity $\cos(v_i, v_j)$ between the source language representation vector $v_i$ and the target language representation vector $v_j$ can be determined as the language similarity between the source language and the target language.
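  • The measure cos(v_i, v_j) can be computed as the dot product of the two vectors divided by the product of their norms. The sketch below uses plain Python lists in place of real model outputs; the function name and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return cos(u, v) = (u . v) / (|u| * |v|) for two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy language representation vectors (hypothetical values).
v_source = [3.0, 4.0]
v_target = [4.0, 3.0]
language_similarity = cosine_similarity(v_source, v_target)
```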
  • Determining the source language representation vector corresponding to the source language and the target language representation vector corresponding to the target language can include: inputting each source text in the source corpus corresponding to the source language into the preset multilingual model, determining the source language representation vector corresponding to each source text, and determining the source language representation vector corresponding to the source language based on the multiple source language representation vectors; and inputting each target text in the target corpus corresponding to the target language into the preset multilingual model, determining the target language representation vector corresponding to each target text, and determining the target language representation vector corresponding to the target language based on the multiple target language representation vectors.
  • Determining the source language representation vector corresponding to the source language may include: averaging the multiple source language representation vectors and determining the resulting average vector as the source language representation vector corresponding to the source language.
  • Determining the target language representation vector corresponding to the target language may include: averaging the multiple target language representation vectors and determining the resulting average vector as the target language representation vector corresponding to the target language.
  • For example, each source text in the source corpus can be input into the pre-trained preset multilingual model, and, based on the output of the model, the source language representation vector $R(x_{im})$ corresponding to each source text can be obtained, where $i$ denotes the source language and $m$ denotes the $m$-th source text.
  • The source language representation vectors are averaged, and the resulting average vector is determined as the source language representation vector $v_i$, that is, $v_i = \frac{1}{n_i} \sum_{m=1}^{n_i} R(x_{im})$, where $n_i$ denotes the number of source texts.
  • Similarly, the target language representation vector $R(x_{jm})$ corresponding to each target text can be determined based on the preset multilingual model, where $j$ denotes the target language.
  • The target language representation vectors are averaged, and the resulting average vector is determined as the target language representation vector $v_j$, that is, $v_j = \frac{1}{n_j} \sum_{m=1}^{n_j} R(x_{jm})$, where $n_j$ denotes the number of target texts.
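  • The averaging step can be sketched as an element-wise mean over the per-text vectors. The sketch assumes the vectors R(x_im) have already been produced by some multilingual encoder and are given as lists of floats; the function name `language_vector` and the toy values are hypothetical.

```python
def language_vector(text_vectors):
    """Element-wise mean of per-text representation vectors.

    `text_vectors` holds one vector per text in the corpus; the result
    plays the role of v_i (or v_j) in the description.
    """
    if not text_vectors:
        raise ValueError("the corpus must contain at least one text")
    n = len(text_vectors)
    dim = len(text_vectors[0])
    if any(len(vec) != dim for vec in text_vectors):
        raise ValueError("all representation vectors must share one dimension")
    return [sum(vec[k] for vec in text_vectors) / n for k in range(dim)]


# Two toy per-text vectors standing in for encoder outputs.
v_i = language_vector([[1.0, 2.0], [3.0, 4.0]])
```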
  • Figure 2 is a flow chart of another machine translation quality assessment method provided by an embodiment of the present application. Building on the above embodiments, this embodiment describes the entire translation quality evaluation process in detail for the case where the quality evaluation indicators include a fluency evaluation indicator and a fidelity evaluation indicator. Explanations of terms that are the same as or correspond to those in the above embodiments are not repeated here.
  • Referring to Figure 2, the machine translation quality assessment method provided in this embodiment includes the following steps:
  • the translation text pair includes a source text corresponding to the source language and a translated target text corresponding to the target language.
  • the fluency evaluation index can be used to characterize the overall fluency of the translated text and whether it conforms to the presentation habits and other information.
  • the preset fluency evaluation model may be an evaluation model that is biased to evaluate the overall fluency of the target text, so as to obtain evaluation results corresponding to the fluency evaluation index.
  • Preset fluency assessment models may include, but are not limited to, at least one of: the COMET-MQM cross-language multidimensional quality model, the COMET-QE cross-language quality assessment model, and the BLEURT bilingual evaluation alternative model.
  • COMET stands for Crosslingual Optimized Metric for Evaluation of Translation.
  • COMET-MQM denotes the cross-language multidimensional quality model, where MQM stands for Multidimensional Quality Metric.
  • BLEURT stands for Bilingual Evaluation Understudy with Representations from Transformers.
  • COMET is a model framework whose metrics are trained on human evaluation data.
  • MQM is a multi-dimensional, multi-level manual evaluation method.
  • The COMET-MQM model is obtained by training the COMET model on MQM data.
  • QE stands for Quality Estimation; the COMET-QE model is obtained by training the COMET model on QE data.
  • The BLEURT model is a bilingual evaluation alternative model built on Transformer representations.
  • The target text can be evaluated for fluency based on at least one preset fluency evaluation model, and the evaluation result corresponding to the fluency evaluation indicator can be determined. For example, if there are multiple preset fluency evaluation models, one of them can be selected at random, quality assessment can be performed on the target text based on the selected model and the source text, and the result obtained can be used as the evaluation result corresponding to the fluency evaluation indicator. Alternatively, quality assessment can be performed on the target text based on each preset fluency evaluation model and the source text, the multiple results obtained can be averaged, and the average can be used as the evaluation result corresponding to the fluency evaluation indicator, in order to improve the accuracy of quality assessment.
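  • The two strategies described above (pick one model at random, or average over all models) can be sketched as follows. The models are represented by stub callables that take the source and target text and return a score; they stand in for real evaluation models and do not reflect any library's API. The function name `indicator_score` and the `strategy` parameter are hypothetical.

```python
import random
from statistics import mean


def indicator_score(models, source, target, strategy="average", rng=None):
    """Score one indicator with several models.

    `models` is a list of callables (source, target) -> float.
    strategy="random" uses one randomly chosen model;
    strategy="average" uses the mean over all models.
    """
    if not models:
        raise ValueError("at least one evaluation model is required")
    if strategy == "random":
        chosen = (rng or random).choice(models)
        return chosen(source, target)
    if strategy == "average":
        return mean(model(source, target) for model in models)
    raise ValueError(f"unknown strategy: {strategy}")


# Stub models with fixed scores (hypothetical values).
stub_models = [lambda s, t: 0.8, lambda s, t: 0.6]
averaged = indicator_score(stub_models, "source text", "target text")
```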
  • The COMET-MQM cross-language multidimensional quality model performs its evaluation based on the source text and the reference translation corresponding to the source text, yielding the evaluation result corresponding to the COMET-MQM fluency evaluation index.
  • The COMET-QE cross-language quality evaluation model performs its evaluation based on the source text, yielding the evaluation result corresponding to the COMET-QE fluency evaluation index.
  • The BLEURT bilingual evaluation alternative model performs its evaluation based on the reference translation corresponding to the source text, yielding the evaluation result corresponding to the BLEURT fluency evaluation index.
  • the fidelity evaluation index can be used to characterize whether the details in the translated text faithfully reflect the meaning of the original text, that is, to evaluate detailed issues such as mistranslations, omissions, and emotional errors in the translation.
  • The preset fidelity evaluation model may be an evaluation model that is biased toward evaluating the fidelity of the target text, so as to obtain the evaluation result corresponding to the fidelity evaluation index.
  • The preset fidelity evaluation model may include, but is not limited to, at least one of the OpenKiwi (Open-Source Machine Translation Quality Estimation in PyTorch) evaluation model and the Yisi-2 semantic evaluation model. Both the OpenKiwi evaluation model and the Yisi-2 semantic evaluation model perform their evaluation based on the source text, yielding the evaluation results corresponding to the two fidelity evaluation indicators OpenKiwi and Yisi-2.
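  • The input requirements stated above for the fluency and fidelity models can be recorded in a small lookup table, for example to decide which models are usable when no reference translation is available. The table below only restates what the description says about each model; the class `ModelInputs`, the function `usable_models`, and the field names are hypothetical and are not the interfaces of the actual COMET, BLEURT, OpenKiwi or Yisi-2 tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInputs:
    indicator: str         # "fluency" or "fidelity"
    needs_source: bool     # evaluation uses the source text
    needs_reference: bool  # evaluation uses a reference translation


# Requirements as stated in the description (not taken from any library).
MODEL_INPUTS = {
    "COMET-MQM": ModelInputs("fluency", True, True),
    "COMET-QE": ModelInputs("fluency", True, False),
    "BLEURT": ModelInputs("fluency", False, True),
    "OpenKiwi": ModelInputs("fidelity", True, False),
    "Yisi-2": ModelInputs("fidelity", True, False),
}


def usable_models(indicator, has_source, has_reference):
    """List the models of one indicator whose required inputs are available."""
    return [
        name
        for name, spec in MODEL_INPUTS.items()
        if spec.indicator == indicator
        and (has_source or not spec.needs_source)
        and (has_reference or not spec.needs_reference)
    ]


# With only the source text available, one fluency model remains usable.
fluency_without_reference = usable_models("fluency", True, False)
```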
  • The target text can be evaluated for fidelity based on at least one preset fidelity evaluation model, and the evaluation result corresponding to the fidelity evaluation indicator can be determined. For example, if there are multiple preset fidelity evaluation models, one of them can be selected at random, quality assessment can be performed on the target text based on the selected model and the source text, and the result obtained can be used as the evaluation result corresponding to the fidelity evaluation indicator. Alternatively, quality assessment can be performed on the target text based on each preset fidelity evaluation model and the source text, the multiple results obtained can be averaged, and the average can be used as the evaluation result corresponding to the fidelity evaluation indicator, in order to improve the accuracy of quality assessment.
  • S240 Input the language similarity between the source language and the target language into the preset network model, and determine the evaluation weight corresponding to the fluency evaluation index and the evaluation weight corresponding to the fidelity evaluation index based on the output of the preset network model.
  • The preset network model can directly output both the evaluation weight corresponding to the fluency evaluation index and the evaluation weight corresponding to the fidelity evaluation index, or it can output only one of the two, from which the evaluation weight of the other indicator is determined.
  • This embodiment determines the optimal evaluation weights corresponding to the fluency evaluation index and the fidelity evaluation index based on language similarity, which addresses the problem that evaluations of translated texts in different languages are biased differently toward fluency or fidelity, and improves the robustness of quality assessment.
  • S240 may include: determining the evaluation weight corresponding to the fluency evaluation index according to the output of the preset network model; determining the evaluation weight corresponding to the fidelity evaluation index based on the evaluation weight corresponding to the fluency evaluation index.
  • The weight output by the preset network model can be used as the evaluation weight corresponding to the fluency evaluation index. Since the two evaluation weights corresponding to the fluency evaluation index and the fidelity evaluation index sum to 1, the difference between 1 and the evaluation weight corresponding to the fluency evaluation index can be determined as the evaluation weight corresponding to the fidelity evaluation index.
  • The evaluation result corresponding to the fluency evaluation index can be multiplied by its evaluation weight, the evaluation result corresponding to the fidelity evaluation index can be multiplied by its evaluation weight, and the two products can be added together; the resulting weighted average is used as the target evaluation result. Fidelity and fluency are thus integrated into a comprehensive evaluation, which avoids the bias toward fidelity or fluency that a single evaluation index introduces when evaluating translated texts and improves the accuracy and robustness of quality evaluation.
  • The technical solution of this embodiment determines the optimal evaluation weights corresponding to the fluency evaluation index and the fidelity evaluation index based on language similarity and performs fusion processing based on these evaluation weights. This addresses the problem that evaluations of translated texts in different languages are biased differently toward fluency or fidelity, and improves the accuracy and robustness of quality assessment.
  • the following is an example of a machine translation quality assessment device provided by the embodiment of the present application.
  • This device belongs to the same inventive concept as the machine translation quality assessment method of the above embodiments. For details not described in the embodiments of the machine translation quality assessment device, refer to the above embodiments of the machine translation quality assessment method.
  • Figure 3 is a schematic structural diagram of a machine translation quality assessment device provided by an embodiment of the present application. This embodiment can be applied to the situation of performing machine translation quality assessment on a pre-trained model, especially in fine-tuning scenarios where the downstream task is a cross-lingual task such as a translation task.
  • the device includes: a translation text pair acquisition module 310, an evaluation result determination module 320, an evaluation weight determination module 330, and an evaluation result fusion module 340.
  • The translation text pair acquisition module 310 is configured to acquire a translation text pair to be evaluated, where the translation text pair includes a source text corresponding to the source language and a translated target text corresponding to the target language;
  • the evaluation result determination module 320 is configured to perform quality evaluation on the target text based on at least two quality evaluation indicators and the source text, and determine the evaluation result corresponding to each quality evaluation indicator;
  • the evaluation weight determination module 330 is configured to determine the evaluation weight corresponding to each quality evaluation index based on the language similarity between the source language and the target language;
  • the evaluation result fusion module 340 is configured to perform fusion processing on at least two evaluation results based on at least two evaluation weights to determine the target evaluation result of the translation text pair.
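  • The division of the device into modules 310 to 340 can be sketched as a class with one method per module. This is an illustrative structure only: the class and method names are hypothetical, the scorers and the similarity function are injected stubs, and the `weigher` callable merely stands in for the preset network model (here it naively uses the similarity itself as the fluency weight, which the source does not claim).

```python
from dataclasses import dataclass


@dataclass
class TranslationPair:
    source_lang: str
    target_lang: str
    source_text: str
    target_text: str


class QualityAssessmentDevice:
    """One method per module 310-340; collaborators are injected stubs."""

    def __init__(self, scorers, similarity, weigher):
        self.scorers = scorers        # indicator name -> (source, target) -> score
        self.similarity = similarity  # (source_lang, target_lang) -> similarity
        self.weigher = weigher        # similarity -> {indicator name: weight}

    def acquire_pair(self, source_lang, target_lang, source_text, target_text):
        # Module 310: acquire the translation text pair to be evaluated.
        return TranslationPair(source_lang, target_lang, source_text, target_text)

    def determine_results(self, pair):
        # Module 320: one evaluation result per quality evaluation indicator.
        return {
            name: scorer(pair.source_text, pair.target_text)
            for name, scorer in self.scorers.items()
        }

    def determine_weights(self, pair):
        # Module 330: weights derived from the language similarity.
        return self.weigher(self.similarity(pair.source_lang, pair.target_lang))

    def fuse(self, results, weights):
        # Module 340: weighted sum of the evaluation results.
        return sum(results[name] * weights[name] for name in results)

    def assess(self, pair):
        return self.fuse(self.determine_results(pair), self.determine_weights(pair))


# Stub collaborators with fixed, hypothetical values.
device = QualityAssessmentDevice(
    scorers={"fluency": lambda s, t: 0.8, "fidelity": lambda s, t: 0.6},
    similarity=lambda src, tgt: 0.5,
    weigher=lambda sim: {"fluency": sim, "fidelity": 1.0 - sim},
)
pair = device.acquire_pair("en", "de", "Good morning.", "Guten Morgen.")
score = device.assess(pair)
```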
  • In the technical solution of this embodiment, quality assessment is performed on the target text based on at least two quality assessment indicators and the source text in the translation text pair to be assessed, and the assessment result corresponding to each quality assessment indicator is determined. The evaluation weight corresponding to each quality evaluation index is determined based on the language similarity between the source language and the target language, and the evaluation results are fused based on the evaluation weights to determine the target evaluation result of the translation text pair. Fusing the evaluation results of at least two different quality evaluation indicators evaluates translation quality comprehensively and avoids biased results. Because each evaluation weight is determined based on the language similarity between the source language and the target language, the language differences between language pairs are taken into account, which avoids large differences in the accuracy of translation evaluation across language pairs and thereby ensures the accuracy of translation evaluation for different language pairs.
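  • The quality evaluation indicators include a fluency evaluation indicator and a fidelity evaluation indicator; the evaluation result determination module 320 is configured to: evaluate the fluency of the target text based on a preset fluency evaluation model and the source text, and determine the evaluation result corresponding to the fluency evaluation indicator; and evaluate the fidelity of the target text based on a preset fidelity evaluation model and the source text, and determine the evaluation result corresponding to the fidelity evaluation indicator.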
  • the preset fluency assessment model includes: at least one of the COMET-MQM cross-language multi-dimensional quality model, the COMET-QE cross-language quality assessment model and the BLEURT bilingual assessment alternative model;
  • The preset fidelity evaluation model includes: at least one of the OpenKiwi evaluation model and the Yisi-2 semantic evaluation model.
  • the evaluation weight determination module 330 includes:
  • The language similarity input unit is configured to input the language similarity between the source language and the target language into a preset network model, where the preset network model is obtained by pre-training based on translation sample pair data and label evaluation results;
  • the evaluation weight determination unit is configured to determine the evaluation weight corresponding to each quality evaluation index based on the output of the preset network model.
  • The evaluation weight determination unit is configured to: determine the evaluation weight corresponding to the fluency evaluation index according to the output of the preset network model; and determine the evaluation weight corresponding to the fidelity evaluation index based on the evaluation weight corresponding to the fluency evaluation index.
  • the device also includes:
  • The language similarity determination module is configured to: before the evaluation weight corresponding to each quality evaluation indicator is determined based on the language similarity between the source language and the target language, determine, based on the preset multilingual model and according to the source corpus corresponding to the source language and the target corpus corresponding to the target language, the source language representation vector corresponding to the source language and the target language representation vector corresponding to the target language; and determine the language similarity between the source language and the target language based on the source language representation vector and the target language representation vector.
  • the language similarity determination module is set to:
  • input each source text in the source corpus corresponding to the source language into the preset multilingual model, determine a per-text representation vector for each source text, and determine the source language representation vector corresponding to the source language based on the multiple per-text representation vectors; input each target text in the target corpus corresponding to the target language into the preset multilingual model, determine a per-text representation vector for each target text, and determine the target language representation vector corresponding to the target language based on the multiple per-text representation vectors.
  • the language similarity determination module is also configured to average the multiple per-text representation vectors of the source texts, and to determine the resulting average vector as the source language representation vector corresponding to the source language.
  • the machine translation quality assessment device provided by the embodiments of this application can execute the machine translation quality assessment method provided by any embodiment of this application, and has corresponding functional modules for executing the machine translation quality assessment method.
  • FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. FIG. 4 illustrates a block diagram of an exemplary electronic device 12 suitable for implementing embodiments of the present application.
  • the electronic device 12 shown in FIG. 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
  • electronic device 12 is embodied in the form of a general-purpose computing device.
  • the components of electronic device 12 may include, but are not limited to: at least one processor or processing unit 16, system memory 28, and a bus 18 connecting various system components (including system memory 28 and processing unit 16).
  • Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
  • these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • Electronic device 12 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by electronic device 12, including volatile and nonvolatile media, removable and non-removable media.
  • System memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32.
  • Electronic device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media.
  • storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard drive").
  • a magnetic disk drive may be provided for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive may be provided for reading from and writing to a removable non-volatile optical disk (e.g., a portable compact disc read-only memory, CD-ROM).
  • System memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of various embodiments of the present application.
  • a program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in system memory 28. Each of these examples, or some combination, may include the implementation of a network environment.
  • Program modules 42 generally perform functions and/or methods in the embodiments described herein.
  • Electronic device 12 may also communicate with at least one external device 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with at least one device that enables a user to interact with electronic device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables electronic device 12 to communicate with at least one other computing device. This communication may occur through an input/output (I/O) interface 22.
  • the electronic device 12 can also communicate with at least one network (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 20.
  • network adapter 20 communicates with other modules of electronic device 12 via bus 18.
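  • the processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, implementing the steps of the machine translation quality assessment method provided by the embodiments of the present application.
  • the method includes:
  • obtaining a translation text pair to be evaluated, where the translation text pair includes a source text corresponding to the source language and a translated target text corresponding to the target language;
  • performing quality evaluation on the target text based on at least two quality evaluation indexes and the source text, and determining the evaluation result corresponding to each quality evaluation index;
  • determining the evaluation weight corresponding to each quality evaluation index based on the language similarity between the source language and the target language;
  • fusing the at least two evaluation results based on the at least two evaluation weights to determine the target evaluation result of the translation text pair.
  • the processor can also implement the technical solution of the machine translation quality assessment method provided by any embodiment of the present application.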
  • This embodiment provides a computer-readable storage medium on which a computer program is stored.
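  • when the program is executed by a processor, the steps of the machine translation quality assessment method provided in any embodiment of the present application are implemented.
  • the method includes:
  • obtaining a translation text pair to be evaluated, where the translation text pair includes a source text corresponding to the source language and a translated target text corresponding to the target language;
  • performing quality evaluation on the target text based on at least two quality evaluation indexes and the source text, and determining the evaluation result corresponding to each quality evaluation index;
  • determining the evaluation weight corresponding to each quality evaluation index based on the language similarity between the source language and the target language;
  • fusing the at least two evaluation results based on the at least two evaluation weights to determine the target evaluation result of the translation text pair.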
  • the computer storage medium in the embodiment of the present application may be any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium may be, for example, but is not limited to: an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, radio frequency (RF), etc., or any suitable combination of the above.
  • Computer program code for performing operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • modules or steps of the present application can be implemented using general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Alternatively, they can be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into an individual integrated circuit module; or multiple modules or steps among them can be made into a single integrated circuit module. As such, the application is not limited to any specific combination of hardware and software.

Abstract

Disclosed in embodiments of the present application are a machine translation quality assessment method and apparatus, a device, and a storage medium, which are applied to the technical field of natural language processing. The method comprises: obtaining a translation text pair to be assessed, the translation text pair comprising a source text corresponding to a source language and a translated target text corresponding to a target language; on the basis of at least two quality assessment indexes and the source text, performing quality assessment on the target text, and determining an assessment result corresponding to each quality assessment index; on the basis of the language similarity between the source language and the target language, determining an assessment weight corresponding to each quality assessment index; and on the basis of at least two assessment weights, fusing at least two assessment results, and determining a target assessment result of the translation text pair.

Description

A machine translation quality assessment method, apparatus, device, and storage medium
This application claims priority to Chinese patent application No. 202210970061.5, filed with the China Patent Office on August 12, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of this application relate to the technical field of natural language processing, for example, to a machine translation quality assessment method, apparatus, device, and storage medium.
Background
With the rapid development of computer technology, it is often necessary to evaluate the quality of text produced by machine translation models.
Currently, the quality of translated text can be evaluated using a translation quality evaluation model. For example, when an evaluation model trained on sentence-level annotated data is used to evaluate translated text, the resulting index evaluation result tends to characterize the overall fluency of the translated text. Alternatively, when an evaluation model trained on word-level annotated data is used, the resulting index evaluation result tends to characterize the fidelity of the translated text.
However, the related art has at least the following problems:
The index evaluation result obtained from each translation quality evaluation model leans toward a single aspect of the translated text, such as its overall fluency or its fidelity, and therefore cannot evaluate translation quality comprehensively. In addition, translated texts of different language pairs are all evaluated in the same way, which leads to large differences in translation evaluation accuracy across language pairs.
Summary
Embodiments of the present application provide a machine translation quality assessment method, apparatus, device, and storage medium, so as to evaluate translation quality comprehensively and to ensure the accuracy of translation evaluation for different language pairs.
In a first aspect, embodiments of the present application provide a machine translation quality assessment method, including:
obtaining a translation text pair to be evaluated, where the translation text pair includes a source text corresponding to a source language and a translated target text corresponding to a target language;
performing quality evaluation on the target text based on at least two quality evaluation indexes and the source text, and determining an evaluation result corresponding to each quality evaluation index;
determining an evaluation weight corresponding to each quality evaluation index based on the language similarity between the source language and the target language; and
fusing the at least two evaluation results based on the at least two evaluation weights to determine a target evaluation result of the translation text pair.
In a second aspect, embodiments of the present application further provide a machine translation quality assessment apparatus, including:
a translation text pair acquisition module, configured to obtain a translation text pair to be evaluated, where the translation text pair includes a source text corresponding to a source language and a translated target text corresponding to a target language;
an evaluation result determination module, configured to perform quality evaluation on the target text based on at least two quality evaluation indexes and the source text, and to determine an evaluation result corresponding to each quality evaluation index;
an evaluation weight determination module, configured to determine an evaluation weight corresponding to each quality evaluation index based on the language similarity between the source language and the target language; and
an evaluation result fusion module, configured to fuse the at least two evaluation results based on the at least two evaluation weights and to determine a target evaluation result of the translation text pair.
In a third aspect, embodiments of the present application further provide an electronic device, including:
at least one processor; and
a memory configured to store at least one program;
where the at least one program, when executed by the at least one processor, causes the at least one processor to implement the machine translation quality assessment method provided by any embodiment of the present application.
In a fourth aspect, embodiments of the present application further provide a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the machine translation quality assessment method provided by any embodiment of the present application is implemented.
Brief Description of the Drawings
FIG. 1 is a flowchart of a machine translation quality assessment method provided by an embodiment of the present application;
FIG. 2 is a flowchart of another machine translation quality assessment method provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a machine translation quality assessment apparatus provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
The present application is described in detail below with reference to the accompanying drawings and embodiments.
FIG. 1 is a flowchart of a machine translation quality assessment method provided by an embodiment of the present application. This embodiment is applicable to evaluating the quality of text translated by a machine translation model. The method can be performed by a machine translation quality assessment apparatus, which can be implemented in software and/or hardware and integrated into an electronic device. As shown in FIG. 1, the method includes the following steps:
S110. Obtain a translation text pair to be evaluated, where the translation text pair includes a source text corresponding to the source language and a translated target text corresponding to the target language.
Here, the source language is the language to be translated from, and the target language is the language translated into. The source text is the original text expressed in the source language, that is, the sentence to be translated. The target text is the translation, expressed in the target language, that carries the same meaning as the source text, that is, the translated sentence.
For example, the source text can be input into a machine translation model for translation, and the target text output by the machine translation model is obtained, thereby yielding the translation text pair to be evaluated.
S120. Based on at least two quality evaluation indexes and the source text, perform quality evaluation on the target text, and determine the evaluation result corresponding to each quality evaluation index.
Here, a quality evaluation index is an index used to evaluate the translation quality of the target text. Different kinds of quality evaluation indexes lean toward different aspects of a translation. For example, a quality evaluation index may be, but is not limited to, a fluency evaluation index, which leans toward the overall fluency of the target text, or a fidelity evaluation index, which leans toward the fidelity of the target text. The fluency evaluation index characterizes information such as the overall fluency of the translated text and whether it conforms to customary usage. The fidelity evaluation index characterizes whether the details of the translated text faithfully reflect the meaning of the original, that is, it judges detail-level problems in the translation such as mistranslations, omissions, and sentiment errors. Each quality evaluation index may correspond to at least one quality evaluation model, so that the evaluation result for that index can be determined using the at least one quality evaluation model. In this embodiment, the evaluation result may be expressed as a score. For example, a larger score indicates a higher degree of the quality measured by that index, such as higher fluency or higher fidelity.
For example, at least two different quality evaluation indexes can be selected according to business needs. For each quality evaluation index, quality evaluation can be performed on the target text based on the at least one quality evaluation model corresponding to that index, and the evaluation result for that index is determined. If the index corresponds to multiple quality evaluation models, one model can be selected at random from them, the target text can be evaluated based on the selected model and the source text, and the result can be used as the evaluation result for that index. Alternatively, the target text can be evaluated with every quality evaluation model together with the source text, and the resulting evaluation results can be averaged to obtain an average evaluation result for that index, which improves the accuracy of the quality evaluation.
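The two strategies described above for an index backed by several quality evaluation models can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the embodiment: the names `index_score`, `Scorer`, and `strategy` are hypothetical, and each scorer is assumed to be a callable that maps a source text and a target text to a numeric score.

```python
import random
from statistics import mean
from typing import Callable, Optional, Sequence

# Hypothetical scorer interface: (source_text, target_text) -> score.
Scorer = Callable[[str, str], float]


def index_score(
    source: str,
    target: str,
    scorers: Sequence[Scorer],
    strategy: str = "mean",
    rng: Optional[random.Random] = None,
) -> float:
    """Return the evaluation result of one quality evaluation index."""
    if not scorers:
        raise ValueError("an index needs at least one quality evaluation model")
    if strategy == "random":
        # Pick one of the models at random and use its score directly.
        chosen = (rng or random.Random()).choice(list(scorers))
        return chosen(source, target)
    if strategy == "mean":
        # Score with every model and average the results.
        return mean(scorer(source, target) for scorer in scorers)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

In practice each scorer would wrap a trained model; here any callable with the assumed signature can be passed, which keeps the sketch independent of a particular library.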
S130. Based on the language similarity between the source language and the target language, determine the evaluation weight corresponding to each quality evaluation index.
Here, language similarity refers to the linguistic similarity between the source language and the target language in terms of language family, vocabulary, grammatical structure, and the like.
For example, the language similarity between every two languages can be determined in advance, so that the language similarity between the source language and the target language can be looked up directly; alternatively, it can be determined in real time. Based on the language similarity between the source language and the target language, the optimal evaluation weights for the different quality evaluation indexes can be determined. In other words, different language pairs yield different evaluation weights, so the linguistic differences between language pairs are taken into account. This effectively avoids large differences in translation evaluation accuracy across language pairs, thereby ensuring evaluation accuracy for different language pairs and improving the generality of translation quality evaluation.
S140. Based on the at least two evaluation weights, fuse the at least two evaluation results to determine the target evaluation result of the translation text pair.
For example, the evaluation result corresponding to each quality evaluation index can be multiplied by its evaluation weight, and the products can be summed; the resulting weighted average is taken as the target evaluation result. In this way the evaluation results of at least two quality evaluation indexes are fused and translation quality is evaluated comprehensively, which avoids the bias that arises when a single evaluation index is used to judge translated text and improves the accuracy and robustness of the quality evaluation.
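The fusion in S140 amounts to a weighted sum of the per-index evaluation results. A minimal Python sketch follows; the function name `fuse_scores` and the index names used as dictionary keys are assumptions made for illustration only.

```python
from typing import Mapping


def fuse_scores(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Fuse per-index evaluation results into one target evaluation result.

    Each score is multiplied by the weight of the same index and the
    products are summed.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same indexes")
    if len(scores) < 2:
        raise ValueError("at least two quality evaluation indexes are required")
    return sum(weights[name] * scores[name] for name in scores)


# Illustrative values only: a fluency score of 0.8 weighted 0.25 and a
# fidelity score of 0.6 weighted 0.75.
target_result = fuse_scores(
    {"fluency": 0.8, "fidelity": 0.6},
    {"fluency": 0.25, "fidelity": 0.75},
)
```

When the weights sum to 1, as in the two-index case described in this embodiment, the weighted sum is a weighted average of the evaluation results.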
In the technical solution of this embodiment, quality evaluation is performed on the target text based on at least two quality evaluation indexes and the source text of the translation text pair to be evaluated, and the evaluation result corresponding to each quality evaluation index is determined. The evaluation weight corresponding to each quality evaluation index is determined based on the language similarity between the source language and the target language, and the at least two evaluation results are fused based on the at least two evaluation weights to determine the target evaluation result of the translation text pair. The evaluation results of at least two different quality evaluation indexes are thereby fused, so translation quality is evaluated comprehensively and biased evaluation results are avoided. Because the evaluation weights are determined from the language similarity between the source language and the target language, the linguistic differences between language pairs are taken into account, which effectively avoids large differences in translation evaluation accuracy across language pairs and ensures evaluation accuracy for different language pairs.
On the basis of the above technical solution, S130 may include: inputting the language similarity between the source language and the target language into a preset network model, where the preset network model is obtained in advance by training on translation sample pair data and label evaluation results; and determining the evaluation weight corresponding to each quality evaluation index according to the output of the preset network model.
Here, the preset network model can be set to represent the mapping between language similarity and the optimal evaluation weight of each quality evaluation index, and this mapping can be learned from translation sample pair data and label evaluation results. For example, based on at least two quality evaluation indexes and the source sample text in each translation sample pair, quality evaluation is performed on the target sample text in that pair to obtain at least two sample evaluation results for each translation sample pair. The language similarity of the sample language pair is input into the preset network model to be trained, and the sample evaluation weight corresponding to each quality evaluation index is determined from the output of the model. The at least two sample evaluation results are then fused using the at least two sample evaluation weights to obtain a target sample evaluation result. A training error is determined from the target sample evaluation result and the label evaluation result, and the training error is back-propagated to the preset network model to adjust its parameters. This is repeated until a preset convergence condition is met, for example, when the number of iterations reaches a preset number or the training error converges, at which point training of the preset network model is complete.
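The training procedure above can be sketched with a deliberately small stand-in model. The following Python example makes several assumptions that are not in the original: there are exactly two indexes A and B, the preset network model is a single sigmoid unit w_A = sigmoid(a * similarity + b), the training error is the mean squared error, and the gradient is derived by hand instead of using an automatic differentiation library. All names and the toy samples are hypothetical.

```python
import math
from typing import List, Tuple

# (language_similarity, score_A, score_B, label_score); values are made up.
Sample = Tuple[float, float, float, float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def train_weight_model(
    samples: List[Sample], lr: float = 0.5, max_iters: int = 2000, tol: float = 1e-10
) -> Tuple[float, float, float]:
    """Fit w_A = sigmoid(a * similarity + b); return (a, b, last measured loss)."""
    a, b = 0.0, 0.0
    prev_loss = float("inf")
    loss = prev_loss
    n = len(samples)
    for _ in range(max_iters):
        grad_a = grad_b = loss = 0.0
        for sim, s_a, s_b, label in samples:
            w_a = sigmoid(a * sim + b)             # weight of index A
            fused = w_a * s_a + (1.0 - w_a) * s_b  # fused sample result
            err = fused - label                    # training error term
            loss += err * err / n
            # Chain rule: d(err^2)/dz with z = a * sim + b.
            dz = 2.0 * err * (s_a - s_b) * w_a * (1.0 - w_a)
            grad_a += dz * sim / n
            grad_b += dz / n
        a -= lr * grad_a
        b -= lr * grad_b
        if abs(prev_loss - loss) < tol:  # training error has converged
            break
        prev_loss = loss
    return a, b, loss


toy_samples = [(0.9, 0.8, 0.4, 0.7), (0.2, 0.9, 0.3, 0.45)]
a, b, final_loss = train_weight_model(toy_samples)
```

Both stopping conditions named in the text appear here: the loop ends after `max_iters` iterations or once the change in the training error falls below `tol`. A real preset network model would likely have more parameters and be trained with a deep learning framework.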
It should be noted that the network architecture of the preset network model can be set according to business requirements. For example, the preset network model may directly output the evaluation weight of every quality evaluation index, or it may output the evaluation weight of only one quality evaluation index, from which the evaluation weights of the other quality evaluation indexes are derived. For example, if there are two quality evaluation indexes A and B and the preset network model is set to output the evaluation weight of index A, then, because the evaluation weights of A and B sum to 1, the evaluation weight of index B is 1 minus the evaluation weight of index A.
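The two-index case can be written out directly. In this short Python sketch, `two_index_weights` is a hypothetical helper name, and the range check on the model output is an added assumption.

```python
from typing import Tuple


def two_index_weights(weight_a: float) -> Tuple[float, float]:
    """Derive the weights of indexes A and B from the model output for A.

    The two evaluation weights sum to 1, so the weight of B is the
    complement of the weight of A.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("the evaluation weight of index A must lie in [0, 1]")
    return weight_a, 1.0 - weight_a


weight_a, weight_b = two_index_weights(0.25)
```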
On the basis of the above technical solution, before S130 the method may further include: determining, based on a preset multilingual model and according to the source corpus corresponding to the source language and the target corpus corresponding to the target language, the source language representation vector corresponding to the source language and the target language representation vector corresponding to the target language; and determining the language similarity between the source language and the target language based on the source language representation vector and the target language representation vector.
Here, the preset multilingual model is a model that performs language processing on text in different languages. For example, the preset multilingual model may be, but is not limited to, the XLM-RoBERTa model.
For example, the source language representation vector v_i, which characterizes the linguistic properties of the source language, can be determined based on the preset multilingual model and the source corpus. The target language representation vector v_j, which characterizes the linguistic properties of the target language, can be determined based on the preset multilingual model and the target corpus. In this embodiment, the cosine similarity cos(v_i, v_j) between the source language representation vector v_i and the target language representation vector v_j can be taken as the language similarity between the source language and the target language.
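The quantity cos(v_i, v_j) is the dot product of the two vectors divided by the product of their norms. A minimal Python sketch using plain lists follows; the function name and the short example vectors are illustrative assumptions, and real representation vectors would have far more dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(v_i: Sequence[float], v_j: Sequence[float]) -> float:
    """Compute cos(v_i, v_j) = (v_i . v_j) / (|v_i| * |v_j|)."""
    if len(v_i) != len(v_j):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(v_i, v_j))
    norm_i = math.sqrt(sum(x * x for x in v_i))
    norm_j = math.sqrt(sum(y * y for y in v_j))
    if norm_i == 0.0 or norm_j == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_i * norm_j)


# Toy vectors standing in for a source and a target language representation.
language_similarity = cosine_similarity([1.0, 2.0, 0.0], [2.0, 4.0, 0.0])
```

Parallel vectors give a value of 1 and orthogonal vectors give 0, so a larger value indicates more similar language representations.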
For example, determining, based on the preset multilingual model and according to the source corpus corresponding to the source language and the target corpus corresponding to the target language, the source language representation vector corresponding to the source language and the target language representation vector corresponding to the target language may include:
inputting each source text in the source corpus corresponding to the source language into the preset multilingual model, determining a source text representation vector for each source text, and determining the source language representation vector corresponding to the source language based on the multiple source text representation vectors; and inputting each target text in the target corpus corresponding to the target language into the preset multilingual model, determining a target text representation vector for each target text, and determining the target language representation vector corresponding to the target language based on the multiple target text representation vectors.
For example, determining the source language representation vector corresponding to the source language based on the multiple source text representation vectors may include: averaging the multiple source text representation vectors, and determining the resulting average vector as the source language representation vector corresponding to the source language.
For example, determining the target language representation vector corresponding to the target language based on the multiple target text representation vectors may include: averaging the multiple target text representation vectors, and determining the resulting average vector as the target language representation vector corresponding to the target language.
For example, each source text in the source corpus can be input into the pre-trained preset multilingual model, and, based on the output of the preset multilingual model, the source text representation vector $R(x_{im})$ of each source text is determined, where $i$ denotes the source language and $m$ denotes the $m$-th source text. The multiple source text representation vectors $R(x_{im})$ are averaged, and the resulting average vector is determined as the source language representation vector $v_i$, that is, $v_i = \frac{1}{n_i}\sum_{m=1}^{n_i} R(x_{im})$, where $n_i$ denotes the number of source texts. Similarly, the target text representation vector $R(x_{jm})$ of each target text can be determined based on the preset multilingual model, where $j$ denotes the target language. The multiple target text representation vectors $R(x_{jm})$ are averaged, and the resulting average vector is determined as the target language representation vector $v_j$, that is, $v_j = \frac{1}{n_j}\sum_{m=1}^{n_j} R(x_{jm})$, where $n_j$ denotes the number of target texts.
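The averaging step can be illustrated as follows. This is a sketch under the assumption that the per-text vectors have already been obtained from the multilingual model and are available as plain lists of floats; the function name `language_vector` is hypothetical.

```python
from typing import List, Sequence


def language_vector(text_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average per-text representation vectors into one language-level vector."""
    if not text_vectors:
        raise ValueError("the corpus must contain at least one text vector")
    dim = len(text_vectors[0])
    if any(len(v) != dim for v in text_vectors):
        raise ValueError("all text vectors must have the same dimension")
    n = len(text_vectors)  # number of texts in the corpus
    # Component-wise mean over all texts of the corpus.
    return [sum(v[k] for v in text_vectors) / n for k in range(dim)]
```

The same function would be applied once to the source corpus and once to the target corpus.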
Figure 2 is a flowchart of another machine translation quality assessment method provided by an embodiment of the present application. Building on the above embodiments, this embodiment describes the entire translation quality assessment process in detail for the case where the quality evaluation indicators include a fluency evaluation indicator and a fidelity evaluation indicator. Explanations of terms that are the same as or correspond to those in the above embodiments are not repeated here.
Referring to Figure 2, the machine translation quality assessment method provided by this embodiment includes the following steps:
S210. Obtain a translation text pair to be evaluated, the translation text pair including a source text in the source language and a translated target text in the target language.
S220. Based on at least one preset fluency evaluation model and the source text, perform a fluency evaluation on the target text, and determine the evaluation result corresponding to the fluency evaluation indicator.
Here, the fluency evaluation indicator can be used to characterize information such as the overall fluency of the translated text and whether it conforms to idiomatic usage. The preset fluency evaluation model may be an evaluation model oriented toward assessing the overall fluency of the target text, so as to obtain the evaluation result corresponding to the fluency evaluation indicator. The preset fluency evaluation model may include, but is not limited to, at least one of: the COMET-MQM (Multidimensional Quality Metric) cross-lingual multidimensional quality model, the COMET-QE cross-lingual quality estimation model, and the BLEURT (Bilingual Evaluation Understudy with Representations from Transformers) model. COMET (Crosslingual Optimized Metric for Evaluation of Translation) is the collective name for a family of translation evaluation models; it is a model framework, and its metrics are all trained on human evaluations. MQM is a multidimensional, multi-level human evaluation method, and the COMET-MQM model is obtained by training the COMET model on MQM data. QE (Quality Estimation) is a specific task in the field of translation evaluation that does not allow the use of reference translations, so evaluation can rely only on the source text. The COMET-QE model is obtained by training the COMET model on QE data. The BLEURT model is a translation evaluation model, a bilingual evaluation understudy built on a Transformer model.
For example, the target text can be evaluated for fluency based on at least one preset fluency evaluation model, and the evaluation result corresponding to the fluency evaluation indicator is determined. For example, if there are multiple preset fluency evaluation models, one of them may be selected at random, the target text may be evaluated based on the selected model and the source text, and the result obtained is used as the evaluation result corresponding to the fluency evaluation indicator. Alternatively, the target text may be evaluated with every preset fluency evaluation model and the source text, the multiple evaluation results obtained are averaged, and the average is used as the evaluation result corresponding to the fluency evaluation indicator, so as to improve the accuracy of the quality assessment.
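The two aggregation options described above (random selection or averaging) can be sketched as below. The scorers are represented as plain callables taking the source text and the target text; this interface, the function name `indicator_score`, and the `strategy` parameter are illustrative assumptions and do not correspond to the API of any particular evaluation library.

```python
import random
from statistics import mean
from typing import Callable, Optional, Sequence

# Assumed interface: (source_text, target_text) -> score
Scorer = Callable[[str, str], float]


def indicator_score(
    scorers: Sequence[Scorer],
    source_text: str,
    target_text: str,
    strategy: str = "mean",
    rng: Optional[random.Random] = None,
) -> float:
    """Combine the scores of several preset evaluation models for one indicator."""
    if not scorers:
        raise ValueError("at least one evaluation model is required")
    if strategy == "random":
        # Option 1: pick a single model at random and use its score.
        chosen = (rng or random).choice(list(scorers))
        return chosen(source_text, target_text)
    if strategy == "mean":
        # Option 2: score with every model and average the results.
        return mean(s(source_text, target_text) for s in scorers)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

The same helper applies unchanged to the fidelity evaluation models, since the text describes the identical two options for them.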
It should be noted that different preset fluency evaluation models use different evaluation methods, so the reference texts they require when evaluating the target text may differ. For example, the COMET-MQM cross-lingual multidimensional quality model evaluates based on the source text and the reference translation corresponding to the source text, yielding the evaluation result for the COMET-MQM fluency evaluation indicator. The COMET-QE cross-lingual quality estimation model evaluates based on the source text, yielding the evaluation result for the COMET-QE fluency evaluation indicator. The BLEURT model evaluates based on the reference translation corresponding to the source text, yielding the evaluation result for the BLEURT fluency evaluation indicator.
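The input requirements listed above can be captured in a small lookup table that decides which fluency models are usable for a given sample. This is a sketch only: the dictionary keys are labels for the models named in the text rather than identifiers from any library, and the helper `usable_models` is hypothetical. The target text is assumed to be always available, since it is the text being evaluated.

```python
from typing import Dict, FrozenSet, List, Set

# Inputs each fluency model needs in addition to the target text,
# as stated in the paragraph above.
REQUIRED_INPUTS: Dict[str, FrozenSet[str]] = {
    "COMET-MQM": frozenset({"source", "reference"}),
    "COMET-QE": frozenset({"source"}),
    "BLEURT": frozenset({"reference"}),
}


def usable_models(available_inputs: Set[str]) -> List[str]:
    """Return the models whose required inputs are all available."""
    return sorted(
        name
        for name, needed in REQUIRED_INPUTS.items()
        if needed <= available_inputs  # subset test
    )
```

When no reference translation exists, only the reference-free model remains in this sketch.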
S230. Based on at least one preset fidelity evaluation model and the source text, perform a fidelity evaluation on the target text, and determine the evaluation result corresponding to the fidelity evaluation indicator.
Here, the fidelity evaluation indicator can be used to characterize whether the details of the translated text faithfully reflect the meaning of the original, that is, to judge detail-level problems in the translation such as mistranslations, omissions, and sentiment errors. The preset fidelity evaluation model may be an evaluation model oriented toward assessing the fidelity of the target text, so as to obtain the evaluation result corresponding to the fidelity evaluation indicator. The preset fidelity evaluation model may include, but is not limited to, at least one of the OpenKiwi (Open-Source Machine Translation Quality Estimation in PyTorch) evaluation model and the Yisi-2 semantic evaluation model. Both the OpenKiwi evaluation model and the Yisi-2 semantic evaluation model evaluate based on the source text, yielding the evaluation results for the OpenKiwi and Yisi-2 fidelity evaluation indicators.
For example, the target text can be evaluated for fidelity based on at least one preset fidelity evaluation model, and the evaluation result corresponding to the fidelity evaluation indicator is determined. For example, if there are multiple preset fidelity evaluation models, one of them may be selected at random, the target text may be evaluated based on the selected model and the source text, and the result obtained is used as the evaluation result corresponding to the fidelity evaluation indicator. Alternatively, the target text may be evaluated with every preset fidelity evaluation model and the source text, the multiple evaluation results obtained are averaged, and the average is used as the evaluation result corresponding to the fidelity evaluation indicator, so as to improve the accuracy of the quality assessment.
S240. Input the language similarity between the source language and the target language into the preset network model, and determine, according to the output of the preset network model, the evaluation weight corresponding to the fluency evaluation indicator and the evaluation weight corresponding to the fidelity evaluation indicator.
For example, the preset network model may directly output both the evaluation weight of the fluency evaluation indicator and the evaluation weight of the fidelity evaluation indicator, or it may output only one of the two, from which the evaluation weight of the other indicator is derived. By determining the optimal evaluation weights of the fluency and fidelity evaluation indicators based on linguistic similarity, this embodiment can effectively address the problem that translated texts in different languages are judged with different biases toward fluency or fidelity, which improves the robustness of the quality assessment.
For example, S240 may include: determining the evaluation weight corresponding to the fluency evaluation indicator according to the output of the preset network model; and determining the evaluation weight corresponding to the fidelity evaluation indicator based on the evaluation weight corresponding to the fluency evaluation indicator.
For example, when the preset network model is a model for predicting the evaluation weight of the fluency evaluation indicator, the weight output by the preset network model can be used as the evaluation weight of the fluency evaluation indicator. Because the two evaluation weights of the fluency and fidelity evaluation indicators sum to 1, the evaluation weight of the fidelity evaluation indicator can be determined as the difference between 1 and the evaluation weight of the fluency evaluation indicator.
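A minimal sketch of this step is given below. The application does not specify the architecture of the preset network model in this passage, so a single logistic unit with hypothetical parameters `slope` and `bias` stands in for it here purely for illustration; in practice these would be whatever the trained model has learned. The only part taken directly from the text is that the fidelity weight is 1 minus the fluency weight.

```python
import math
from typing import Tuple


def fluency_weight(similarity: float, slope: float = 1.0, bias: float = 0.0) -> float:
    """Stand-in for the preset network model: map similarity to a weight in (0, 1).

    Assumption: a one-neuron logistic model; slope and bias are placeholders
    for trained parameters.
    """
    return 1.0 / (1.0 + math.exp(-(slope * similarity + bias)))


def evaluation_weights(
    similarity: float, slope: float = 1.0, bias: float = 0.0
) -> Tuple[float, float]:
    """Return (fluency weight, fidelity weight); the two weights sum to 1."""
    w_fluency = fluency_weight(similarity, slope, bias)
    w_fidelity = 1.0 - w_fluency  # derived as described in the text
    return w_fluency, w_fidelity
```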
S250. Based on the at least two evaluation weights, perform fusion processing on the at least two evaluation results, and determine the target evaluation result of the translation text pair.
For example, the evaluation result of the fluency evaluation indicator can be multiplied by its evaluation weight, the evaluation result of the fidelity evaluation indicator can be multiplied by its evaluation weight, and the two products can be added; the resulting weighted average is used as the target evaluation result. In this way fidelity and fluency are combined into a comprehensive evaluation, which avoids the bias toward either fidelity or fluency that arises when a single evaluation indicator is used to judge the translated text, and thereby improves the accuracy and robustness of the quality assessment.
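The weighted fusion can be written out as a short function. This is a sketch; the name `fuse` and the check that the weights sum to 1 are additions for illustration.

```python
import math
from typing import Sequence


def fuse(results: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of per-indicator evaluation results (weights sum to 1)."""
    if len(results) != len(weights):
        raise ValueError("each evaluation result needs exactly one weight")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("evaluation weights are expected to sum to 1")
    return sum(r * w for r, w in zip(results, weights))
```

For instance, with a fluency result of 0.5, a fidelity result of 1.0 and equal weights of 0.5, the target evaluation result is 0.5 × 0.5 + 1.0 × 0.5 = 0.75.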
In the technical solution of this embodiment, the optimal evaluation weights of the fluency evaluation indicator and the fidelity evaluation indicator are determined based on linguistic similarity, and fusion is performed based on these evaluation weights. This effectively addresses the problem that translated texts in different languages are judged with different biases toward fluency or fidelity, and improves the accuracy and robustness of the quality assessment.
The following is an embodiment of the machine translation quality assessment apparatus provided by the embodiments of the present application. The apparatus belongs to the same inventive concept as the machine translation quality assessment method of the above embodiments; for details not described exhaustively in the apparatus embodiment, reference may be made to the above method embodiments.
Figure 3 is a schematic structural diagram of a machine translation quality assessment apparatus provided by an embodiment of the present application. This embodiment is applicable to assessing the machine translation quality of a pre-trained model, and in particular to fine-tuning scenarios in which the downstream task is a cross-lingual task such as translation. As shown in Figure 3, the apparatus includes: a translation text pair acquisition module 310, an evaluation result determination module 320, an evaluation weight determination module 330, and an evaluation result fusion module 340.
Here, the translation text pair acquisition module 310 is configured to obtain a translation text pair to be evaluated, the translation text pair including a source text in the source language and a translated target text in the target language; the evaluation result determination module 320 is configured to perform quality evaluation on the target text based on at least two quality evaluation indicators and the source text, and to determine the evaluation result corresponding to each quality evaluation indicator; the evaluation weight determination module 330 is configured to determine the evaluation weight corresponding to each quality evaluation indicator based on the language similarity between the source language and the target language; and the evaluation result fusion module 340 is configured to perform fusion processing on the at least two evaluation results based on the at least two evaluation weights, and to determine the target evaluation result of the translation text pair.
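How the four modules fit together can be sketched as a small class. All names below (`TranslationPair`, `QualityAssessor`, `evaluate`, `weigh`) are hypothetical, and modules 320 and 330 are injected as plain callables so that the sketch stays independent of any concrete evaluation model or network.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class TranslationPair:
    """What module 310 would hand over: the pair to be evaluated."""
    source_lang: str
    target_lang: str
    source_text: str
    target_text: str


@dataclass
class QualityAssessor:
    # Module 320 (assumed interface): one evaluation result per indicator.
    evaluate: Callable[[TranslationPair], Sequence[float]]
    # Module 330 (assumed interface): one weight per indicator,
    # derived from the source/target language pair.
    weigh: Callable[[str, str], Sequence[float]]

    def assess(self, pair: TranslationPair) -> float:
        results = self.evaluate(pair)
        weights = self.weigh(pair.source_lang, pair.target_lang)
        if len(results) != len(weights):
            raise ValueError("results and weights must align per indicator")
        # Module 340: fuse the results with their weights.
        return sum(r * w for r, w in zip(results, weights))
```

Injecting the evaluation and weighting steps keeps the division into modules purely functional, in line with the later remark that the module boundaries are a logical partition rather than a required structure.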
In the technical solution of this embodiment, the target text is evaluated based on at least two quality evaluation indicators and the source text of the translation text pair to be evaluated, and the evaluation result corresponding to each quality evaluation indicator is determined; the evaluation weight corresponding to each quality evaluation indicator is determined based on the language similarity between the source language and the target language; and the evaluation results are fused based on these evaluation weights to determine the target evaluation result of the translation text pair. The evaluation results of at least two different quality evaluation indicators are thus fused into a comprehensive assessment of translation quality, which avoids biased evaluation results. Moreover, because the evaluation weights are determined from the language similarity between the source language and the target language, the differences between language pairs are taken into account, which effectively avoids large discrepancies in evaluation accuracy across language pairs and thereby ensures the accuracy of translation evaluation for different language pairs.
Optionally, the quality evaluation indicators include a fluency evaluation indicator and a fidelity evaluation indicator; the evaluation result determination module 320 is configured to:
perform a fluency evaluation on the target text based on at least one preset fluency evaluation model and the source text, and determine the evaluation result corresponding to the fluency evaluation indicator; and perform a fidelity evaluation on the target text based on at least one preset fidelity evaluation model and the source text, and determine the evaluation result corresponding to the fidelity evaluation indicator.
Optionally, the preset fluency evaluation model includes at least one of: the COMET-MQM cross-lingual multidimensional quality model, the COMET-QE cross-lingual quality estimation model, and the BLEURT model;
the preset fidelity evaluation model includes at least one of the OpenKiwi evaluation model and the Yisi-2 semantic evaluation model.
Optionally, the evaluation weight determination module 330 includes:
a language similarity input unit, configured to input the language similarity between the source language and the target language into the preset network model, the preset network model having been obtained in advance by training on translation sample pair data and labeled evaluation results; and
an evaluation weight determination unit, configured to determine the evaluation weight corresponding to each quality evaluation indicator according to the output of the preset network model.
Optionally, when the quality evaluation indicators include a fluency evaluation indicator and a fidelity evaluation indicator, the evaluation weight determination unit is configured to:
determine the evaluation weight corresponding to the fluency evaluation indicator according to the output of the preset network model; and determine the evaluation weight corresponding to the fidelity evaluation indicator based on the evaluation weight corresponding to the fluency evaluation indicator.
Optionally, the apparatus further includes:
a language similarity determination module, configured to: before the evaluation weight corresponding to each quality evaluation indicator is determined based on the language similarity between the source language and the target language, determine, based on a preset multilingual model and according to the source corpus corresponding to the source language and the target corpus corresponding to the target language, the source language representation vector corresponding to the source language and the target language representation vector corresponding to the target language; and determine the language similarity between the source language and the target language based on the source language representation vector and the target language representation vector.
Optionally, the language similarity determination module is configured to:
input each source text in the source corpus corresponding to the source language into the preset multilingual model, determine a source text representation vector for each source text, and determine the source language representation vector corresponding to the source language based on the multiple source text representation vectors; and input each target text in the target corpus corresponding to the target language into the preset multilingual model, determine a target text representation vector for each target text, and determine the target language representation vector corresponding to the target language based on the multiple target text representation vectors.
Optionally, the language similarity determination module is further configured to: average the multiple source text representation vectors, and determine the resulting average vector as the source language representation vector corresponding to the source language.
The machine translation quality assessment apparatus provided by the embodiments of the present application can execute the machine translation quality assessment method provided by any embodiment of the present application, and has the functional modules corresponding to that method.
It is worth noting that, in the above apparatus embodiment, the units and modules are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be realized. In addition, the specific names of the functional units serve only to distinguish them from one another and are not intended to limit the scope of protection of the present application.
Figure 4 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. Figure 4 shows a block diagram of an exemplary electronic device 12 suitable for implementing embodiments of the present application. The electronic device 12 shown in Figure 4 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Figure 4, the electronic device 12 takes the form of a general-purpose computing device. The components of the electronic device 12 may include, but are not limited to: at least one processor or processing unit 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The electronic device 12 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the electronic device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 34 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Figure 4, commonly referred to as a "hard disk drive"). Although not shown in Figure 4, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (for example, a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (for example, a Compact Disc Read-Only Memory (CD-ROM), a Digital Video Disc Read-Only Memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to the bus 18 through at least one data media interface. The system memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present application.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the system memory 28. Such program modules 42 include, but are not limited to, an operating system, at least one application program, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present application.
The electronic device 12 may also communicate with at least one external device 14 (for example, a keyboard, a pointing device, a display 24, etc.), with at least one device that enables a user to interact with the electronic device 12, and/or with any device (for example, a network card, a modem, etc.) that enables the electronic device 12 to communicate with at least one other computing device. Such communication may take place through an input/output (I/O) interface 22. In addition, the electronic device 12 may communicate with at least one network (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown in the figure, the network adapter 20 communicates with the other modules of the electronic device 12 through the bus 18. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
The processing unit 16 executes various functional applications and performs data processing by running programs stored in the system memory 28, for example implementing the steps of the machine translation quality assessment method provided by the embodiments of the present application. The method includes:
obtaining a translation text pair to be evaluated, where the translation text pair includes a source text in a source language and a translated target text in a target language;
performing quality assessment on the target text based on at least two quality assessment indicators and the source text, and determining an evaluation result corresponding to each quality assessment indicator;
determining, based on the language similarity between the source language and the target language, an evaluation weight corresponding to each quality assessment indicator; and
fusing the at least two evaluation results based on the at least two evaluation weights to determine a target evaluation result of the translation text pair.
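As a non-limiting illustration, the four steps above can be sketched in Python as follows. The function name `assess`, the `Scorer` type, and the use of a weighted sum as the fusion rule are assumptions made for this sketch only; the application does not prescribe a particular fusion formula here, and the per-indicator scorers and the weights are supplied by the caller rather than computed by any specific model.

```python
from typing import Callable, Dict

# Hypothetical scorer signature: (source_text, target_text) -> score.
Scorer = Callable[[str, str], float]


def assess(
    source_text: str,
    target_text: str,
    scorers: Dict[str, Scorer],
    weights: Dict[str, float],
) -> float:
    """Score the target text with every indicator, then fuse the results.

    `weights` is assumed to have been derived beforehand from the language
    similarity between the source language and the target language.
    """
    if len(scorers) < 2:
        raise ValueError("at least two quality assessment indicators are required")
    if set(scorers) != set(weights):
        raise ValueError("each indicator needs exactly one evaluation weight")

    # Step 2: one evaluation result per quality assessment indicator.
    results = {name: scorer(source_text, target_text) for name, scorer in scorers.items()}

    # Step 4: fusion, here assumed to be a plain weighted sum.
    return sum(weights[name] * results[name] for name in results)


if __name__ == "__main__":
    # Toy scorers that return fixed values, used only to exercise the sketch.
    toy_scorers = {
        "fluency": lambda src, tgt: 0.8,
        "fidelity": lambda src, tgt: 0.6,
    }
    toy_weights = {"fluency": 0.25, "fidelity": 0.75}
    print(assess("source sentence", "target sentence", toy_scorers, toy_weights))
```

Passing the scorers and weights in as arguments keeps the sketch independent of any concrete fluency or fidelity model.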
Of course, those skilled in the art will understand that the processor may also implement the technical solution of the machine translation quality assessment method provided by any embodiment of the present application.
This embodiment provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the steps of the machine translation quality assessment method provided by any embodiment of the present application are implemented. The method includes:
obtaining a translation text pair to be evaluated, where the translation text pair includes a source text in a source language and a translated target text in a target language;
performing quality assessment on the target text based on at least two quality assessment indicators and the source text, and determining an evaluation result corresponding to each quality assessment indicator;
determining, based on the language similarity between the source language and the target language, an evaluation weight corresponding to each quality assessment indicator; and
fusing the at least two evaluation results based on the at least two evaluation weights to determine a target evaluation result of the translation text pair.
The computer storage medium of the embodiments of the present application may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having at least one wire, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, radio frequency (RF), etc., or any suitable combination of the above.
Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
Those of ordinary skill in the art should understand that the modules or steps of the present application described above may be implemented with general-purpose computing devices. They may be concentrated on a single computing device or distributed over a network composed of multiple computing devices. Optionally, they may be implemented with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they may each be fabricated as an individual integrated circuit module, or several of the modules or steps may be fabricated as a single integrated circuit module. As such, the present application is not limited to any specific combination of hardware and software.
Note that the above are only optional embodiments of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made by those skilled in the art without departing from the scope of protection of the present application. Therefore, although the present application has been described in some detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from the concept of the present application; the scope of the present application is determined by the scope of the appended claims.

Claims (10)

  1. A machine translation quality assessment method, comprising:
    obtaining a translation text pair to be evaluated, wherein the translation text pair comprises a source text in a source language and a translated target text in a target language;
    performing quality assessment on the target text based on at least two quality assessment indicators and the source text, and determining an evaluation result corresponding to each of the quality assessment indicators;
    determining, based on a language similarity between the source language and the target language, an evaluation weight corresponding to each of the quality assessment indicators; and
    fusing the at least two evaluation results based on the at least two evaluation weights to determine a target evaluation result of the translation text pair.
  2. The method according to claim 1, wherein the at least two quality assessment indicators comprise a fluency assessment indicator and a fidelity assessment indicator;
    and wherein performing quality assessment on the target text based on the at least two quality assessment indicators and the source text, and determining the evaluation result corresponding to each of the quality assessment indicators, comprises:
    performing fluency assessment on the target text based on at least one preset fluency assessment model and the source text, and determining the evaluation result corresponding to the fluency assessment indicator; and
    performing fidelity assessment on the target text based on at least one preset fidelity assessment model and the source text, and determining the evaluation result corresponding to the fidelity assessment indicator.
  3. The method according to claim 1, wherein determining, based on the language similarity between the source language and the target language, the evaluation weight corresponding to each of the quality assessment indicators comprises:
    inputting the language similarity between the source language and the target language into a preset network model, wherein the preset network model is obtained in advance by training on translation sample pair data and label evaluation results; and
    determining, according to the output of the preset network model, the evaluation weight corresponding to each of the quality assessment indicators.
  4. The method according to claim 3, wherein, when the at least two quality assessment indicators comprise a fluency assessment indicator and a fidelity assessment indicator, determining, according to the output of the preset network model, the evaluation weight corresponding to each of the quality assessment indicators comprises:
    determining, according to the output of the preset network model, the evaluation weight corresponding to the fluency assessment indicator; and
    determining, based on the evaluation weight corresponding to the fluency assessment indicator, the evaluation weight corresponding to the fidelity assessment indicator.
  5. The method according to any one of claims 1 to 4, further comprising, before determining, based on the language similarity between the source language and the target language, the evaluation weight corresponding to each of the quality assessment indicators:
    determining, based on a preset multilingual model and according to a source corpus corresponding to the source language and a target corpus corresponding to the target language, a source language representation vector corresponding to the source language and a target language representation vector corresponding to the target language; and
    determining, based on the source language representation vector and the target language representation vector, the language similarity between the source language and the target language.
  6. The method according to claim 5, wherein determining, based on the preset multilingual model and according to the source corpus corresponding to the source language and the target corpus corresponding to the target language, the source language representation vector corresponding to the source language and the target language representation vector corresponding to the target language comprises:
    inputting each source text in the source corpus corresponding to the source language into the preset multilingual model, determining a per-text source language representation vector for each source text, and determining, based on the plurality of per-text source language representation vectors, the source language representation vector corresponding to the source language; and
    inputting each target text in the target corpus corresponding to the target language into the preset multilingual model, determining a per-text target language representation vector for each target text, and determining, based on the plurality of per-text target language representation vectors, the target language representation vector corresponding to the target language.
  7. The method according to claim 6, wherein determining, based on the plurality of per-text source language representation vectors, the source language representation vector corresponding to the source language comprises:
    averaging the plurality of per-text source language representation vectors, and determining the resulting average vector as the source language representation vector corresponding to the source language.
  8. A machine translation quality assessment apparatus, comprising:
    a translation text pair acquisition module, configured to obtain a translation text pair to be evaluated, wherein the translation text pair comprises a source text in a source language and a translated target text in a target language;
    an evaluation result determination module, configured to perform quality assessment on the target text based on at least two quality assessment indicators and the source text, and to determine an evaluation result corresponding to each of the quality assessment indicators;
    an evaluation weight determination module, configured to determine, based on a language similarity between the source language and the target language, an evaluation weight corresponding to each of the quality assessment indicators; and
    an evaluation result fusion module, configured to fuse the at least two evaluation results based on the at least two evaluation weights to determine a target evaluation result of the translation text pair.
  9. An electronic device, comprising:
    at least one processor; and
    a memory configured to store at least one program;
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the machine translation quality assessment method according to any one of claims 1 to 7.
  10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the machine translation quality assessment method according to any one of claims 1 to 7.
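
As a non-limiting illustration of claims 4 to 7, the following Python sketch averages per-text representation vectors into one vector per language, compares the two language vectors, and derives a pair of weights. Several choices here are assumptions made for the sketch and are not stated in the claims: cosine similarity is used as the similarity measure, the fidelity weight is taken as one minus the fluency weight, and the per-text vectors are assumed to have already been produced by some multilingual model. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average per-text vectors into one language-level vector (claim 7)."""
    if not vectors:
        raise ValueError("at least one vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure between two language-level vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def complementary_weights(fluency_weight: float) -> Tuple[float, float]:
    """Assumed rule: the fidelity weight is 1 minus the fluency weight."""
    if not 0.0 <= fluency_weight <= 1.0:
        raise ValueError("fluency weight must lie in [0, 1]")
    return fluency_weight, 1.0 - fluency_weight


if __name__ == "__main__":
    # Toy per-text vectors standing in for the output of a multilingual model.
    source_vectors = [[1.0, 0.0], [0.8, 0.2]]
    target_vectors = [[0.6, 0.4], [0.4, 0.6]]
    similarity = cosine_similarity(mean_vector(source_vectors), mean_vector(target_vectors))
    print(similarity)
    print(complementary_weights(0.3))
```

In the claimed method the fluency weight comes from the output of the preset network model; the sketch leaves that model out and only shows the surrounding arithmetic.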
PCT/CN2023/112135 2022-08-12 2023-08-10 Machine translation quality assessment method and apparatus, device, and storage medium WO2024032691A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210970061.5A CN115310460A (en) 2022-08-12 2022-08-12 Machine translation quality evaluation method, device, equipment and storage medium
CN202210970061.5 2022-08-12

Publications (1)

Publication Number Publication Date
WO2024032691A1 true WO2024032691A1 (en) 2024-02-15

Family

ID=83862779

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/112135 WO2024032691A1 (en) 2022-08-12 2023-08-10 Machine translation quality assessment method and apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN115310460A (en)
WO (1) WO2024032691A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115310460A (en) * 2022-08-12 2022-11-08 京东科技信息技术有限公司 Machine translation quality evaluation method, device, equipment and storage medium
CN116341561B (en) * 2023-03-27 2024-02-02 京东科技信息技术有限公司 Voice sample data generation method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170161264A1 (en) * 2015-12-07 2017-06-08 Linkedin Corporation Generating multi-anguage social network user profiles by translation
CN107357783A (en) * 2017-07-04 2017-11-17 桂林电子科技大学 A kind of English translation mass analysis method of translator of Chinese into English
CN111027331A (en) * 2019-12-05 2020-04-17 百度在线网络技术(北京)有限公司 Method and apparatus for evaluating translation quality
CN114004238A (en) * 2021-09-23 2022-02-01 昆明理工大学 Chinese-transcendental neural machine translation quality estimation method integrating language differentiation characteristics
CN115310460A (en) * 2022-08-12 2022-11-08 京东科技信息技术有限公司 Machine translation quality evaluation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN115310460A (en) 2022-11-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23851912

Country of ref document: EP

Kind code of ref document: A1