WO2022116821A1

WO2022116821A1 - Translation method and apparatus employing multi-language machine translation model, device, and medium

Info

Publication number: WO2022116821A1
Application number: PCT/CN2021/131090
Authority: WO
Inventors: 赵程绮; 朱耀明; 王明轩; 封江涛; 李磊
Original assignee: 北京有竹居网络技术有限公司
Priority date: 2020-12-04
Filing date: 2021-11-17
Publication date: 2022-06-09
Also published as: CN112380876A

Abstract

Provided are a translation method and apparatus employing a multi-language machine translation model, a device, and a medium. The translation method employing a multi-language machine translation model comprises: acquiring an original phrase to be translated and translation language information of the original phrase (S101); determining a target adapter corresponding to the translation language information of the original phrase, wherein the target adapter is used to correct translation errors of a preset multi-language machine translation model (S102); and translating the original phrase on the basis of the multi-language machine translation model and the target adapter, and obtaining a target phrase (S103).

Description

Translation method, apparatus, device and medium based on a multilingual machine translation model

This application claims the priority of the Chinese patent application with application number 202011409340.1 filed with the China Patent Office on December 04, 2020, the entire contents of which are incorporated herein by reference.

Technical field

The present disclosure relates to the field of computer technology, for example, to a translation method, apparatus, device and medium based on a multilingual machine translation model.

Background

Machine Translation (MT) is one of the core tasks in natural language processing, which aims to use computer programs to translate one natural language into another natural language.

Traditional machine translation models are generally bilingual machine translation models, each of which handles translation in one language direction, such as translating English into Chinese. When the number of languages is large, a large number of bilingual machine translation models must be trained to achieve translation between every pair of natural languages. Multilingual machine translation models, which handle multiple language directions with a single model, have therefore gradually replaced bilingual machine translation models and become one of the commonly used kinds of machine translation model.

However, under the same parameter configuration and model architecture, the performance of the multilingual machine translation model is often inferior to that of the bilingual machine translation model, resulting in large translation errors in the translation results output by the multilingual machine translation model.

Summary

The present disclosure provides a translation method, apparatus, device and medium based on a multilingual machine translation model, so as to improve the accuracy of the translation result output by the multilingual machine translation model.

The present disclosure provides a translation method based on a multilingual machine translation model, including:

acquiring an original sentence to be translated and translation language information of the original sentence;

determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model;

translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.

The present disclosure also provides a translation device based on a multilingual machine translation model, including:

a sentence acquisition module, configured to acquire an original sentence to be translated and translation language information of the original sentence;

an adapter determination module, configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model;

a translation module, configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.

The present disclosure also provides an electronic device, comprising:

one or more processors;

a memory, configured to store one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the above-mentioned translation method based on a multilingual machine translation model.

The present disclosure also provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the above-mentioned translation method based on a multilingual machine translation model.

Description of drawings

FIG. 1 is a schematic flowchart of a translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure;

FIG. 2 is a schematic structural diagram of an adapter according to an embodiment of the present disclosure;

FIG. 3 is a schematic structural diagram of a multilingual machine translation model according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of a connection relationship of a target adapter according to an embodiment of the present disclosure;

FIG. 5 is a schematic flowchart of another translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure;

FIG. 6 is a structural block diagram of a translation device based on a multilingual machine translation model provided by an embodiment of the present disclosure;

FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.

Detailed description

Embodiments of the present disclosure will be described below with reference to the accompanying drawings. Although some embodiments of the present disclosure are shown in the drawings, the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, which are provided for a more thorough and complete understanding of this disclosure. The figures and examples of the present disclosure are for illustrative purposes only.

The multiple steps described in the method embodiments of the present disclosure may be performed in different orders, and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this regard.

As used herein, the term "including" and variations thereof are open-ended inclusions, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.

Concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different devices, modules or units, and are not used to limit the order or interdependence of the functions performed by these devices, modules or units.

The modifiers "a" and "a plurality" mentioned in the present disclosure are illustrative rather than limiting, and those skilled in the art should understand that, unless the context indicates otherwise, they should be construed as "one or more".

The names of messages or information exchanged between multiple devices in the embodiments of the present disclosure are only for illustrative purposes, and are not intended to limit the scope of these messages or information.

FIG. 1 is a schematic flowchart of a translation method based on a multilingual machine translation model according to an embodiment of the present disclosure. The method may be performed by a translation apparatus based on a multilingual machine translation model. The apparatus may be implemented by software and/or hardware and may be configured in an electronic device; for example, the apparatus may be configured in a mobile phone, a tablet computer or a computer device. As shown in FIG. 1, the translation method based on the multilingual machine translation model provided by this embodiment may include:

S101. Acquire an original sentence to be translated and translation language information of the original sentence.

The original sentence is the sentence to be translated in the current translation. It can be entered by the user through an input device such as a keyboard, or obtained by text recognition or speech recognition. That is, when a user needs to translate a sentence, the user can enter the sentence into the electronic device by text or voice input, or can take a picture containing the sentence, or obtain a text containing the sentence, and import the picture or text into the electronic device. Correspondingly, the translation method based on the multilingual machine translation model provided in this embodiment can translate text or speech input by the user, and can also translate sentences contained in pictures or texts imported by the user. When a sentence in speech or in a picture is to be translated, the sentence can first be converted into an original sentence in text form. The following description takes the case where the user enters the original sentence as text as an example. The translation language information of the original sentence can be understood as the translation direction of the current translation. It can include original language information (that is, the language of the original sentence to be translated) and target language information (that is, the language of the target sentence into which the original sentence is to be translated). The language information may be, for example, English, Chinese, or German.

Exemplarily, when a user needs to translate an original sentence, the user enters into the electronic device the original sentence, the original language to which the original sentence belongs, and the target language into which the original sentence is to be translated, thereby generating a translation instruction for the original sentence. Correspondingly, on receiving the translation instruction, the electronic device obtains the original sentence and determines its translation language information; for example, the original language selected by the user on the translation page is determined as the original language information, and the target language selected by the user on the translation page is determined as the target language information.

S102. Determine a target adapter corresponding to the translation language information of the original sentence, where the target adapter is used to correct translation errors of a preset multilingual machine translation model.

In this embodiment, after a multilingual machine translation model has been obtained by training, adapters can be set for the model for different translation scenarios (that is, for different translation language information). When the multilingual machine translation model is used to translate an original sentence in a given translation scenario, the adapter corresponding to that scenario is used to correct the translation errors of the model, thereby improving the accuracy of the translation result it outputs. Moreover, the number of parameters in an adapter is very small (less than one-twentieth of the number of parameters in the multilingual machine translation model, and the larger the model, the smaller the ratio). Correcting the translation errors of the multilingual machine translation model by configuring adapters therefore adds very little overhead and is easy to deploy.

After acquiring the translation language information of the original sentence, the electronic device can select, from a plurality of preset adapters, the adapter corresponding to that translation language information as the target adapter.
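The lookup described above can be sketched as a table keyed by translation direction. This is only an illustrative sketch: the registry `ADAPTER_REGISTRY`, the function `determine_target_adapters`, the language codes and the adapter names are all hypothetical and do not come from the disclosure.

```python
from typing import Dict, List, Tuple

# (original language, target language) identifies one translation direction.
LanguagePair = Tuple[str, str]

# Hypothetical registry: each direction maps to the names of its preset adapters.
ADAPTER_REGISTRY: Dict[LanguagePair, List[str]] = {
    ("en", "zh"): ["en-zh/encoder", "en-zh/decoder"],
    ("en", "de"): ["en-de/encoder", "en-de/decoder"],
}


def determine_target_adapters(original_lang: str, target_lang: str) -> List[str]:
    """Return the preset adapters for one translation direction (step S102)."""
    key = (original_lang, target_lang)
    if key not in ADAPTER_REGISTRY:
        # No adapter was prepared for this direction.
        raise KeyError(f"no adapter configured for direction {key}")
    return ADAPTER_REGISTRY[key]
```

Keying on the ordered pair keeps the two directions of a language pair separate, which matches the idea that the translation language information describes a direction rather than an unordered pair.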

One translation language information may correspond to one adapter or to multiple adapters. That is, in this embodiment, only one adapter may be set for each translation language information, in which case only that adapter is used to correct the translation errors of the multilingual machine translation model when an original sentence is translated. Alternatively, multiple adapters may be set for each translation language information, in which case the multiple adapters are used together to correct the translation errors of the multilingual machine translation model when an original sentence is translated, thereby improving the accuracy of the translation result output by the model. The following description takes the latter case as an example. When the translation language information differs, the adapters used can differ, and the structure of the adapter can be chosen flexibly. For example, an adapter may include a first feedforward layer and a second feedforward layer, with an activation function configured between the first feedforward layer and the second feedforward layer; the activation function may be a Gaussian Error Linear Unit (GELU), as shown in FIG. 2.
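A minimal sketch of such an adapter, written in plain Python so that it runs without any library, is given below. The class name `Adapter`, the helper names, the weight values and the layer sizes are hypothetical; the disclosure only states that an activation function such as GELU may sit between two feedforward layers.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Feedforward layer for one vector: weights @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class Adapter:
    """First feedforward layer, then GELU, then second feedforward layer."""

    def __init__(self, w1: Matrix, b1: Vector, w2: Matrix, b2: Vector) -> None:
        self.w1, self.b1 = w1, b1
        self.w2, self.b2 = w2, b2

    def __call__(self, x: Vector) -> Vector:
        hidden = [gelu(h) for h in linear(self.w1, self.b1, x)]
        return linear(self.w2, self.b2, hidden)


# Hypothetical sizes and weights: 2-dimensional input, 2 hidden units, 2 outputs.
adapter = Adapter(
    w1=[[1.0, 0.0], [0.0, 1.0]], b1=[0.0, 0.0],
    w2=[[1.0, 1.0], [0.0, 1.0]], b2=[0.5, -0.5],
)
print(adapter([0.0, 0.0]))  # a zero input leaves only the second-layer bias
```

The output has the same dimension as the input here so that it can later be combined element by element with the output of the layer being corrected.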

In this embodiment, the type of the multilingual machine translation model can be set as required. For example, the multilingual machine translation model can be a Transformer model, as shown in FIG. 3 (FIG. 3 shows only one encoder and one decoder by way of example). The multilingual machine translation model may include at least one encoder and at least one decoder. In one embodiment, the model includes multiple encoders and multiple decoders, for example 6 encoders and 6 decoders. Each encoder is provided with at least two encoder sub-layers: a self-attention layer and a feedforward layer. Each decoder is provided with at least three decoder sub-layers: a self-attention layer, an encoder-decoder attention layer and a feedforward layer. The multiple encoders are connected in series, the multiple decoders are connected in series, and the feedforward layer of the last encoder is connected to the encoder-decoder attention layer of each decoder.

Exemplarily, when setting the adapters corresponding to a translation language information for a multilingual machine translation model, a corresponding adapter may be set for each encoder and each decoder in the multilingual machine translation model, or a corresponding adapter may be set for each encoder sub-layer and each decoder sub-layer; this embodiment does not limit this. In order to reduce the number of adapters that need to be set for the multilingual machine translation model while still ensuring the accuracy of the translation results it outputs, in this embodiment the encoder sub-layers in each encoder may be divided into one or more encoder sub-layer components, the decoder sub-layers in each decoder may be divided into one or more decoder sub-layer components, and a corresponding adapter may be set for each encoder sub-layer component and each decoder sub-layer component. In this case, the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and each encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and each decoder sub-layer component is composed of at least one decoder sub-layer; and each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.
The values of the parameters in the adapters corresponding to different sub-layer components can be different. The self-attention layer and the feedforward layer in each encoder can each form a separate encoder sub-layer component; the self-attention layer and the encoder-decoder attention layer in each decoder can together form one decoder sub-layer component; and the feedforward layer in each decoder can form a decoder sub-layer component on its own, as shown in FIG. 4 (FIG. 4 shows only one encoder and one decoder by way of example).
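To make the grouping concrete, the following sketch enumerates the sub-layer components for the example division described above and builds one adapter slot per component and per translation direction. The names `SubLayerComponent`, `build_components` and `adapter_slots`, and the naming scheme for components, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SubLayerComponent:
    name: str                     # hypothetical component identifier
    sub_layers: Tuple[str, ...]   # the sub-layers grouped into this component


def build_components(num_encoders: int, num_decoders: int) -> List[SubLayerComponent]:
    """Group sub-layers into components following the example division."""
    components: List[SubLayerComponent] = []
    for i in range(num_encoders):
        # Each encoder sub-layer forms its own component.
        components.append(SubLayerComponent(f"enc{i}.attn", ("self_attention",)))
        components.append(SubLayerComponent(f"enc{i}.ffn", ("feed_forward",)))
    for i in range(num_decoders):
        # The two decoder attention sub-layers share one component.
        components.append(SubLayerComponent(
            f"dec{i}.attn", ("self_attention", "encoder_decoder_attention")))
        components.append(SubLayerComponent(f"dec{i}.ffn", ("feed_forward",)))
    return components


def adapter_slots(components: List[SubLayerComponent],
                  directions: List[Tuple[str, str]]) -> Dict[Tuple[Tuple[str, str], str], str]:
    """One adapter slot per (translation direction, component)."""
    return {(d, c.name): f"adapter[{d[0]}->{d[1]}][{c.name}]"
            for d in directions for c in components}
```

With 6 encoders and 6 decoders this division yields 24 components, compared with 30 individual sub-layers, which illustrates how grouping reduces the number of adapters per translation direction.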

In one embodiment, before acquiring the original sentence to be translated and the translation language information of the original sentence, the method further includes: for each translation language information, acquiring a plurality of training samples consistent with the translation language information, and inputting the plurality of training samples into the multilingual machine translation model, so as to obtain the adapter corresponding to the translation language information by training.

In the above embodiment, the adapters corresponding to each sub-level component (including encoder sub-level components and decoder sub-level components) under different translation language information can be obtained through training. To obtain the adapter corresponding to each sub-level component under a given translation language information, after the training of the multilingual machine translation model is completed, the parameters of the multilingual machine translation model can be fixed and an original adapter can be set for each sub-level component of the model. Each original adapter is then trained using training samples consistent with the translation language information, and the translation error of the multilingual machine translation model is tested using test samples. The adapter of each sub-level component at the point where the translation error of the multilingual machine translation model is less than a preset error threshold is determined as the adapter corresponding to that sub-level component under the translation language information.
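The training procedure can be illustrated with a deliberately tiny stand-in. In the sketch below, the "model" is a single fixed scalar weight and the "adapter" is a single trainable scalar, so it is not a translation model at all; it only shows the pattern of freezing the base parameters, training the adapter alone, and accepting the adapter once the test error falls below a preset threshold. All names, the squared-error measure, the learning rate and the data are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (input, reference output)

FROZEN_BASE_WEIGHT = 2.0  # stands in for the fixed parameters of the trained model


def base_model(x: float) -> float:
    """Frozen base model: its parameter is never updated below."""
    return FROZEN_BASE_WEIGHT * x


def train_adapter(train: List[Sample], lr: float = 0.05, epochs: int = 200) -> float:
    """Train only the adapter parameter by stochastic gradient descent."""
    a = 0.0  # the original (untrained) adapter
    for _ in range(epochs):
        for x, y in train:
            prediction = base_model(x) + a * x      # base output plus correction
            gradient = 2.0 * (prediction - y) * x   # d/da of the squared error
            a -= lr * gradient
    return a


def mean_squared_error(a: float, samples: List[Sample]) -> float:
    return sum((base_model(x) + a * x - y) ** 2 for x, y in samples) / len(samples)


def select_adapter(train: List[Sample], test: List[Sample],
                   error_threshold: float) -> Optional[float]:
    """Keep the trained adapter only if the test error is below the threshold."""
    a = train_adapter(train)
    return a if mean_squared_error(a, test) < error_threshold else None


# Hypothetical data for one translation direction: the reference is 2.5 * x.
train_samples = [(0.5, 1.25), (1.0, 2.5), (1.5, 3.75)]
test_samples = [(2.0, 5.0)]
adapter_parameter = select_adapter(train_samples, test_samples, error_threshold=1e-4)
```

Because only the adapter parameter receives gradient updates, the base model behaves identically for every other translation direction, which is the property that lets one frozen model serve many directions.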

S103. Translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.

The target sentence is the sentence translated from the original sentence.

Exemplarily, under a given translation language information, when the multilingual machine translation model is configured with only one adapter for the model as a whole, the data output by the multilingual machine translation model can be obtained and corrected using the target adapter to obtain the target sentence. When each encoder and each decoder in the multilingual machine translation model is configured with a corresponding adapter, after an encoder or decoder outputs data, the adapter corresponding to that encoder or decoder can be used to correct the data, the corrected data is input into the next layer, and the sentence output by the multilingual machine translation model is determined as the target sentence. When each sub-level component in each encoder and each decoder is configured with a corresponding adapter, after a sub-level component outputs original data, the adapter corresponding to that sub-level component can be used to correct the original data, the corrected target data is input into the next layer, and the sentence output by the multilingual machine translation model is determined as the target sentence. In this last case, translating the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence includes: using the multilingual machine translation model to translate the original sentence, and using the first target adapter of each sub-level component to correct the original output data of that sub-level component, so as to obtain the target sentence, where the sub-level components include encoder sub-level components and/or decoder sub-level components.

The translation method based on the multilingual machine translation model provided by this embodiment acquires the original sentence to be translated and the translation language information of the original sentence, determines the target adapter that corresponds to the translation language information of the original sentence and is used to correct translation errors of the preset multilingual machine translation model, and translates the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence. By adopting this technical solution and using an adapter to correct the translation errors of the multilingual machine translation model, this embodiment can improve the accuracy of the translation result output by the multilingual machine translation model.

FIG. 5 is a schematic flowchart of another translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure. The solution in this embodiment may be combined with one or more optional solutions in the foregoing embodiments. Optionally, using the multilingual machine translation model to translate the original sentence, and using the first target adapter of each sub-level component to correct the original output data of that sub-level component so as to obtain the target sentence, includes: determining, according to the connection relationship of the multiple sub-level components in the multilingual machine translation model, the first sub-level component in the model as the current sub-level component, and obtaining the target input data of the current sub-level component; determining the first target adapter of the current sub-level component as the current target adapter; inputting the target input data into the current sub-level component and the current target adapter to obtain the original output data of the current sub-level component and the current correction parameters output by the current target adapter; using the current correction parameters to correct the original output data to obtain the target output data of the current sub-level component; determining the target output data as the target input data of the next-level component, determining the next-level component as the current sub-level component, and returning to the operation of determining the first target adapter of the current sub-level component as the current target adapter, until there is no next-level component; and, when there is no next-level component, inputting the target output data of the current sub-level component into the next layer of the current sub-level component to obtain the target sentence.

Correspondingly, as shown in FIG. 5 , the translation method based on the multilingual machine translation model provided in this embodiment may include:

S201. Acquire an original sentence to be translated and translation language information of the original sentence.

S202. Determine a target adapter corresponding to the translation language information of the original sentence, where the target adapter is used to correct translation errors of a preset multilingual machine translation model.

S203. According to the connection relationship of multiple sub-level components in the multilingual machine translation model, determine the first sub-level component in the multi-language machine translation model as the current sub-level component, and obtain the current sub-level component. target input data.

The target input data can be understood as the data input into the current sub-level component.

Exemplarily, when translating the original sentence, the original sentence can be input into the multilingual machine translation model, and, according to the connection relationship between the multiple layers of the model, each layer is controlled in turn to process the information output by the previous layer. When the target output data of the layer preceding the first sub-level component in the multilingual machine translation model is obtained, the first sub-level component is determined as the current sub-level component, and the target output data of that preceding layer is determined as the target input data of the current sub-level component.

S204. Determine the first target adapter of the current sub-layer component as the current target adapter.

The first target adapter can be understood as the adapter that is configured for a sub-level component and is used for correcting the original output data of that sub-level component, that is, the adapter configured for an encoder sub-level component of the encoder or for a decoder sub-level component of the decoder. In this embodiment, the target adapters corresponding to the translation language information of the original sentence may include the first target adapter configured for each sub-level component in the multilingual machine translation model, and may also include the second target adapter configured for the word embedding layers (including the input word embedding layer and the output word embedding layer) in the multilingual machine translation model.

The first target adapter corresponding to the component identification information of the current sub-level component may be selected from the plurality of target adapters determined in S202 as the current target adapter according to the component identification information of the current sub-level component.

S205. Input the target input data into the current sub-level component and the current target adapter, so as to obtain the original output data of the current sub-level component and the current correction parameters output by the current target adapter.

The original output data of the current sub-level component can be understood as the data before correction obtained by the current sub-level component operating on its target input data. The current correction parameter can be understood as a parameter for correcting the original output data of the current sub-level component, which can be calculated by the current target adapter configured by the current sub-level component according to the target input data of the current sub-level component.

Exemplarily, after the current target adapter of the current sub-level component is determined, the target input data of the current sub-level component can be input into both the current sub-level component and the current target adapter. The data output by the current sub-level component is obtained as the original output data of the current sub-level component, and the data output by the current target adapter is obtained as the current correction parameters.

S206. Correct the original output data by using the current correction parameter to obtain the target output data of the current sub-layer component.

Exemplarily, the current correction parameters can be used to correct the original output data; for example, the original output data is corrected to the sum of the original output data and the current correction parameters. The data obtained by correcting the original output data is determined as the target output data of the current sub-layer component.

S207: Determine whether there is a next-level component, if there is a next-level component, execute S208; if there is no next-level component, execute S209.

Exemplarily, according to the connection relationship of the multiple sub-level components in the multilingual machine translation model, it can be judged whether the next layer connected to the output end of the current sub-level component is located in a sub-level component. If so, it is determined that a next-level component exists, and the sub-level component to which that next layer belongs is determined as the next-level component.

If the output end of the current sub-level component is connected to the input ends of multiple sub-layers, then, according to the data flow in the multilingual machine translation model, the sub-level component through which the data flows first, among the sub-level components in which those sub-layers are located, is determined as the next-level component.

S208: Determine the target output data as the target input data of the next-level component, determine the next-level component as the current sub-level component, and return to executing S204.

S209: Input the target output data of the current sub-level component into the next layer of the current sub-level component to obtain a target sentence.

Exemplarily, as shown in FIG. 4, when the output end of the last sub-level component in the multilingual machine translation model is connected to the output layer of the model, the absence of a next-level component means that the current sub-level component is the last sub-level component in the model. In this case, after obtaining the target output data of the current sub-level component, the electronic device can input the target output data into the output layer of the multilingual machine translation model, and the model outputs, through the output layer, the target sentence obtained by translating the original sentence.
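Steps S203 to S209 can be sketched as a loop over the sub-level components. The sketch below makes two simplifying assumptions that are not in the disclosure: the components form a single linear chain (the real model also feeds the encoder output into each decoder), and the correction is the element-wise sum given as an example in S206. The function and parameter names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_components_with_adapters(
    first_input: Vector,
    component_order: Sequence[str],
    components: Dict[str, Layer],
    first_target_adapters: Dict[str, Layer],
    output_layer: Layer,
) -> Vector:
    """Run each component, correct its output, and pass the result on."""
    target_input = first_input                               # S203
    for name in component_order:                             # S207/S208: next component, if any
        current_adapter = first_target_adapters[name]        # S204
        original_output = components[name](target_input)     # S205: component output
        correction = current_adapter(target_input)           # S205: adapter output
        target_output = [o + c for o, c in zip(original_output, correction)]  # S206
        target_input = target_output                         # S208
    return output_layer(target_input)                        # S209


# Hypothetical toy layers standing in for real sub-level components.
toy_components: Dict[str, Layer] = {
    "a": lambda v: [2.0 * x for x in v],
    "b": lambda v: [x + 1.0 for x in v],
}
toy_adapters: Dict[str, Layer] = {
    "a": lambda v: [0.5 * x for x in v],
    "b": lambda v: [0.0 for _ in v],
}
result = run_components_with_adapters(
    [1.0, 2.0], ["a", "b"], toy_components, toy_adapters, lambda v: v)
```

Note that the adapter receives the same target input data as the component it corrects, not the component's output, which follows the description of S205.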

In one embodiment, the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer. The output end of the input word embedding layer is connected to the input end of the first encoder sub-layer component in the multilingual machine translation model, and the output end of the output word embedding layer is connected to the input end of the first decoder sub-layer component in the multilingual machine translation model.

In the above embodiment, as shown in FIG. 3, the multilingual machine translation model may also be provided with an input word embedding layer and an output word embedding layer. The output end of the input word embedding layer may be connected to the input end of the self-attention layer of the first encoder in the multilingual machine translation model, and the output end of the output word embedding layer may be connected to the input end of the self-attention layer of the first decoder in the multilingual machine translation model.

In order to improve the accuracy of the translation results output by the multilingual machine translation model, in the above embodiment, as shown in the accompanying drawings, adapters may also be configured for the input word embedding layer and the output word embedding layer, so that the input word embedding layer and the output word embedding layer of the multilingual machine translation model better model word semantics. In this case, the translation method based on the multilingual machine translation model further includes: when the original word embedding output data of a word embedding layer is received, inputting the original word embedding output data into the second target adapter of that word embedding layer and obtaining the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; and using the word embedding correction parameters to correct the original word embedding output data to obtain the target word embedding output data of the word embedding layer, so that the target word embedding output data is used as the target input data of the sub-layer component connected to the word embedding layer.

The second target adapter can be understood as an adapter configured for the input word embedding layer or the output word embedding layer in order to correct the original word embedding output data of that layer. The original word embedding output data can be understood as the data output by the word embedding layer.

Exemplarily, after obtaining the first original word embedding output data output by the input word embedding layer, the electronic device may first obtain the second target adapter corresponding to the identification information of the input word embedding layer, input the first original word embedding output data into that second target adapter, and obtain the data output by the second target adapter as the word embedding correction parameters. The electronic device then corrects the first original word embedding output data using the word embedding correction parameters, uses the corrected first original word embedding output data as the target word embedding output data of the input word embedding layer, and inputs the target word embedding output data into the self-attention layer connected to the output end of the input word embedding layer.

Likewise, after obtaining the second original word embedding output data output by the output word embedding layer, the electronic device may first obtain the second target adapter corresponding to the identification information of the output word embedding layer, input the second original word embedding output data into that second target adapter, and obtain the data output by the second target adapter as the word embedding correction parameters. The electronic device then corrects the second original word embedding output data using the word embedding correction parameters, uses the corrected second original word embedding output data as the target word embedding output data of the output word embedding layer, and inputs the target word embedding output data into the self-attention layer connected to the output end of the output word embedding layer.
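The word embedding correction described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the disclosure: the bottleneck structure of `BottleneckAdapter`, the layer identifier `"input_embedding"`, the toy dimensions, and in particular the choice to apply the correction parameters by addition, since the text only states that the original output is "corrected".

```python
import numpy as np


class BottleneckAdapter:
    """Hypothetical adapter: down-projection, ReLU, up-projection."""

    def __init__(self, d_model, d_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w_down = rng.normal(0.0, 0.02, size=(d_model, d_hidden))
        self.w_up = rng.normal(0.0, 0.02, size=(d_hidden, d_model))

    def __call__(self, x):
        # Maps the layer output to correction parameters of the same shape.
        return np.maximum(x @ self.w_down, 0.0) @ self.w_up


def corrected_embedding(token_ids, embedding_table, adapters, layer_id):
    original = embedding_table[token_ids]  # original word embedding output data
    adapter = adapters[layer_id]           # second target adapter, found by layer id
    correction = adapter(original)         # word embedding correction parameters
    return original + correction           # ASSUMPTION: the correction is additive


table = np.random.default_rng(1).normal(size=(10, 8))  # toy vocabulary of 10 tokens
adapters = {"input_embedding": BottleneckAdapter(d_model=8, d_hidden=2)}
target = corrected_embedding(np.array([3, 1, 4]), table, adapters, "input_embedding")
print(target.shape)  # (3, 8)
```

In this sketch the returned array plays the role of the target word embedding output data that would be passed to the self-attention layer connected to the embedding layer.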

In the translation method based on the multilingual machine translation model provided by this embodiment, adapters corresponding to different translation language information are set for each encoder component and each decoder component in the multilingual machine translation model. When the original sentence is translated, the adapter set for each encoder component and each decoder component is used to correct the data output by the corresponding encoder or decoder, which can improve the translation accuracy of the multilingual machine translation model while adding only a small number of parameters.

FIG. 6 is a structural block diagram of a translation apparatus based on a multilingual machine translation model according to an embodiment of the present disclosure. The apparatus can be implemented by software and/or hardware, and can be configured in electronic equipment. For example, the apparatus can be configured in mobile phones, tablet computers, or computer equipment, and can perform sentence translation by executing a translation method based on a multilingual machine translation model.

As shown in FIG. 6, the translation apparatus based on a multilingual machine translation model provided in this embodiment may include a sentence acquisition module 601, an adapter determination module 602, and a translation module 603. The sentence acquisition module 601 is configured to acquire the original sentence to be translated and the translation language information of the original sentence; the adapter determination module 602 is configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and the translation module 603 is configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence.
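The division of work among the three modules can be sketched roughly as follows. This is a structural illustration only: the class name `TranslationApparatus`, the request dictionary keys, the `"en-zh"` language key, and the toy model and adapter are all hypothetical, and the way the toy model uses the adapter is a deliberate simplification of the model-internal correction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Adapter = Callable[[str], str]


@dataclass
class TranslationApparatus:
    # Stand-in for the preset multilingual model: takes a sentence and an adapter.
    base_model: Callable[[str, Adapter], str]
    # One adapter per kind of translation language information (hypothetical keys).
    adapters: Dict[str, Adapter]

    def acquire_sentence(self, request: Dict[str, str]) -> Tuple[str, str]:
        # Sentence acquisition module 601
        return request["sentence"], request["language_info"]

    def determine_adapter(self, language_info: str) -> Adapter:
        # Adapter determination module 602
        return self.adapters[language_info]

    def translate(self, request: Dict[str, str]) -> str:
        # Translation module 603: base model and target adapter work together.
        sentence, language_info = self.acquire_sentence(request)
        adapter = self.determine_adapter(language_info)
        return self.base_model(sentence, adapter)


def toy_model(sentence: str, adapter: Adapter) -> str:
    draft = sentence.upper()  # toy "translation" produced by the base model
    return adapter(draft)     # toy "correction" applied by the adapter


apparatus = TranslationApparatus(
    base_model=toy_model,
    adapters={"en-zh": lambda draft: draft.replace("HELO", "HELLO")},
)
print(apparatus.translate({"sentence": "helo world", "language_info": "en-zh"}))
```

Keeping the adapter lookup separate from the model call mirrors the idea that one shared model serves every language direction while only the small adapter is swapped.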

In the translation apparatus based on the multilingual machine translation model provided by this embodiment, the sentence acquisition module acquires the original sentence to be translated and the translation language information of the original sentence; the adapter determination module determines the target adapter corresponding to the translation language information of the original sentence, the target adapter being used to correct translation errors of the preset multilingual machine translation model; and the translation module translates the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence. By adopting the above technical solution and using an adapter to correct the translation errors of the multilingual machine translation model, this embodiment can improve the accuracy of the translation results output by the multilingual machine translation model.

Optionally, the multilingual machine translation model includes an encoder and a decoder. The encoder includes at least one encoder sub-layer component, and the encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and the decoder sub-layer component is composed of at least one decoder sub-layer. Each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.

Optionally, the translation module 603 is configured to: translate the original sentence using the multilingual machine translation model, and correct the original output value of each sub-layer component using the first target adapter of that sub-layer component, so as to obtain the target sentence, wherein the sub-layer components include an encoder sub-layer component and/or a decoder sub-layer component.

Optionally, the translation module 603 includes: a component determination unit, configured to determine, according to the connection relationship of the multiple sub-layer components in the multilingual machine translation model, the first sub-layer component in the multilingual machine translation model as the current sub-layer component, and to acquire the target input data of the current sub-layer component; an adapter acquisition unit, configured to determine the first target adapter of the current sub-layer component as the current target adapter; a parameter determination unit, configured to input the target input data into the current sub-layer component and the current target adapter, so as to obtain the original output data of the current sub-layer component and the current correction parameters output by the current target adapter; a correction unit, configured to correct the original output data using the current correction parameters to obtain the target output data of the current sub-layer component; a calling unit, configured to determine the target output data as the target input data of the next sub-layer component, determine the next sub-layer component as the current sub-layer component, and return to call the adapter acquisition unit until there is no next sub-layer component; and an input unit, configured to, when there is no next sub-layer component, input the target output data of the current sub-layer component into the layer following the current sub-layer component to obtain the target sentence.
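The iteration carried out by these units can be sketched as a simple loop. The function name `run_sublayer_components`, the use of plain callables for components and adapters, and the additive form of the correction are assumptions made for illustration; the disclosure does not fix these details in this passage.

```python
import numpy as np


def run_sublayer_components(components, adapters, first_input, final_layer):
    """Pass data through each sub-layer component, correcting every output.

    components[i] and adapters[i] both receive the same target input data.
    """
    current_input = first_input
    for component, adapter in zip(components, adapters):
        original_output = component(current_input)  # original output data
        correction = adapter(current_input)         # current correction parameters
        # ASSUMPTION: the correction is applied by addition.
        target_output = original_output + correction
        current_input = target_output               # input of the next component
    # No next sub-layer component remains: hand over to the following layer.
    return final_layer(current_input)


components = [lambda x: 2.0 * x, lambda x: x + 1.0]   # toy sub-layer components
adapters = [lambda x: 0.1 * x, lambda x: np.zeros_like(x)]  # toy adapters
result = run_sublayer_components(
    components, adapters, np.array([1.0, 2.0]), final_layer=lambda x: x.sum()
)
print(result)
```

In a real model the components would be attention and feed-forward blocks and the final layer would produce the output tokens; the toy callables here only make the data flow visible.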

In the above solution, the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer is connected to the input end of the first encoder sub-layer component in the multilingual machine translation model, and the output end of the output word embedding layer is connected to the input end of the first decoder sub-layer component in the multilingual machine translation model.

Optionally, the translation apparatus based on the multilingual machine translation model provided by this embodiment further includes: an adapter input module, configured to, when receiving the original word embedding output data of a word embedding layer, input the original word embedding output data into the second target adapter of the word embedding layer and obtain the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; and an embedding layer correction module, configured to correct the original word embedding output data using the word embedding correction parameters to obtain the target word embedding output data of the word embedding layer, so as to use the target word embedding output data as the target input data of the sub-layer component connected to the word embedding layer.

Optionally, the translation apparatus based on the multilingual machine translation model provided in this embodiment further includes: an adapter training module, configured to, before the original sentence to be translated and the translation language information of the original sentence are acquired, obtain, for each kind of translation language information, multiple training samples that match the translation language information, and input the multiple training samples into the multilingual machine translation model, so as to obtain, through training, the adapter corresponding to the translation language information.
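As a rough illustration of training one adapter per kind of translation language information, the sketch below fits a small linear adapter by gradient descent on a toy regression task. Several points are assumptions rather than statements from the disclosure: the base model is reduced to a single weight matrix that stays frozen during adapter training, the adapter is linear and additive, the loss is mean squared error, and the language keys `"en-zh"` and `"en-fr"` are hypothetical.

```python
import numpy as np


def train_adapter(base_w, inputs, targets, steps=500, lr=0.05):
    """Fit adapter_w so that inputs @ base_w + inputs @ adapter_w ~ targets.

    ASSUMPTION: base_w is frozen; only the adapter weights are updated.
    """
    adapter_w = np.zeros_like(base_w)
    n = len(inputs)
    for _ in range(steps):
        prediction = inputs @ base_w + inputs @ adapter_w
        gradient = inputs.T @ (prediction - targets) / n  # gradient of 0.5 * MSE
        adapter_w -= lr * gradient
    return adapter_w


rng = np.random.default_rng(0)
base_w = rng.normal(size=(4, 4))  # toy stand-in for the shared model

# Toy training samples grouped by translation language information.
samples = {}
for language_info in ("en-zh", "en-fr"):
    x = rng.normal(size=(64, 4))
    ideal_w = base_w + rng.normal(scale=0.3, size=(4, 4))  # language-specific target
    samples[language_info] = (x, x @ ideal_w)

# One adapter is trained for each kind of translation language information.
adapters = {
    language_info: train_adapter(base_w, x, y)
    for language_info, (x, y) in samples.items()
}
print(sorted(adapters))
```

Because only the adapter weights change, adapters for different languages can be trained independently without disturbing what the shared model already does.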

The translation apparatus based on the multilingual machine translation model provided by the embodiment of the present disclosure can execute the translation method based on the multilingual machine translation model provided by any embodiment of the present disclosure, and has the functional modules and effects corresponding to executing that translation method. For technical details not described in detail in this embodiment, reference may be made to the translation method based on a multilingual machine translation model provided by any embodiment of the present disclosure.

Referring next to FIG. 7, it shows a schematic structural diagram of an electronic device (e.g., a terminal device) 700 suitable for implementing an embodiment of the present disclosure. Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), and in-vehicle terminals (e.g., in-vehicle navigation terminals), and stationary terminals such as digital televisions (TVs) and desktop computers. The electronic device shown in FIG. 7 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.

As shown in FIG. 7, the electronic device 700 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 701, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.

Typically, the following devices can be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 707 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 7 shows an electronic device 700 having various devices, it is not required to implement or have all of the illustrated devices. More or fewer devices may alternatively be implemented or provided.

According to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from the network via the communication device 709, or from the storage device 708, or from the ROM 702. When the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.

The computer-readable medium described above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Examples of computer-readable storage media may include, but are not limited to, an electrical connection with one or more wires, a portable computer disk, a hard disk, RAM, ROM, erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The program code embodied on the computer-readable medium can be transmitted by any suitable medium, including but not limited to electric wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the above.

In some embodiments, clients and servers can communicate using any currently known or future-developed network protocol, such as HyperText Transfer Protocol (HTTP), and can be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.

The above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.

The above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, they cause the electronic device to: obtain the original sentence to be translated and the translation language information of the original sentence; determine the target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and translate the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence.

Computer program code for performing operations of the present disclosure may be written in one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a LAN or WAN, or, alternatively, may be connected to an external computer (e.g., through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It is also noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.

The units involved in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a module or unit does not constitute a limitation on the unit itself.

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and so on.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. Examples of machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, RAM, ROM, EPROM or flash memory, optical fibers, CD-ROMs, optical storage devices, magnetic storage devices, or any suitable combination of the above.

According to one or more embodiments of the present disclosure, Example 1 provides a translation method based on a multilingual machine translation model, including:

obtaining the original sentence to be translated and the translation language information of the original sentence; determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.

According to one or more embodiments of the present disclosure, Example 2 provides the method of Example 1, wherein the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and the encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and the decoder sub-layer component is composed of at least one decoder sub-layer; and each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.

According to one or more embodiments of the present disclosure, Example 3 provides the method of Example 2, wherein translating the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence includes:

translating the original sentence using the multilingual machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of that sub-layer component, so as to obtain the target sentence, wherein the sub-layer components include an encoder sub-layer component and/or a decoder sub-layer component.

According to one or more embodiments of the present disclosure, Example 4 provides the method of Example 3, wherein translating the original sentence using the multilingual machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of that sub-layer component to obtain the target sentence, includes:

determining, according to the connection relationship of the multiple sub-layer components in the multilingual machine translation model, the first sub-layer component in the multilingual machine translation model as the current sub-layer component, and obtaining the target input data of the current sub-layer component; determining the first target adapter of the current sub-layer component as the current target adapter; inputting the target input data into the current sub-layer component and the current target adapter to obtain the original output data of the current sub-layer component and the current correction parameters output by the current target adapter; correcting the original output data using the current correction parameters to obtain the target output data of the current sub-layer component; determining the target output data as the target input data of the next sub-layer component, determining the next sub-layer component as the current sub-layer component, and returning to the operation of determining the first target adapter of the current sub-layer component as the current target adapter, until there is no next sub-layer component; and, when there is no next sub-layer component, inputting the target output data of the current sub-layer component into the layer following the current sub-layer component to obtain the target sentence.

According to one or more embodiments of the present disclosure, Example 5 provides the method of Example 3 or 4, wherein the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer is connected to the input end of the first encoder sub-layer component in the multilingual machine translation model, and the output end of the output word embedding layer is connected to the input end of the first decoder sub-layer component in the multilingual machine translation model.

According to one or more embodiments of the present disclosure, Example 6 is the method of Example 5, further comprising:

when receiving the original word embedding output data of a word embedding layer, inputting the original word embedding output data into the second target adapter of the word embedding layer, and obtaining the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; and correcting the original word embedding output data using the word embedding correction parameters to obtain the target word embedding output data of the word embedding layer, so as to use the target word embedding output data as the target input data of the sub-layer component connected to the word embedding layer.

According to one or more embodiments of the present disclosure, Example 7 provides the method of any one of Examples 1-4, further including, before acquiring the original sentence to be translated and the translation language information of the original sentence:

for each kind of translation language information, obtaining multiple training samples that match the translation language information, and inputting the multiple training samples into the multilingual machine translation model, so as to obtain, through training, the adapter corresponding to the translation language information.

According to one or more embodiments of the present disclosure, Example 8 provides a translation apparatus based on a multilingual machine translation model, including:

a sentence acquisition module, configured to acquire the original sentence to be translated and the translation language information of the original sentence; an adapter determination module, configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and a translation module, configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence.

According to one or more embodiments of the present disclosure, Example 9 provides an electronic device, comprising:

one or more processors; and a memory configured to store one or more programs; wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the translation method based on a multilingual machine translation model described in any one of Examples 1-7.

According to one or more embodiments of the present disclosure, Example 10 provides a computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the translation method based on a multilingual machine translation model described in any one of Examples 1-7.

Additionally, although operations are depicted in a particular order, this should not be construed as requiring that the operations be performed in the particular order shown or in a sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while the above discussion contains several implementation details, these should not be construed as limitations on the scope of the present disclosure. Some features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Claims

A translation method based on a multilingual machine translation model, comprising:

obtaining the original sentence to be translated and the translated language information of the original sentence;

determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model;

translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
The method of claim 1, wherein the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and the encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and the decoder sub-layer component is composed of at least one decoder sub-layer; and each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.
The method according to claim 2, wherein the translation of the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence comprises:

translating the original sentence using the multilingual machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of that sub-layer component, so as to obtain the target sentence, wherein the sub-layer components include at least one of an encoder sub-layer component and a decoder sub-layer component.
The method according to claim 3, wherein translating the original sentence using the multilingual machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of that sub-layer component to obtain the target sentence, comprises:

determining, according to the connection relationship of the multiple sub-layer components in the multilingual machine translation model, the first sub-layer component in the multilingual machine translation model as the current sub-layer component, and obtaining the target input data of the current sub-layer component;

determining the first target adapter of the current sub-level component as the current target adapter;

inputting the target input data into the current sub-level component and the current target adapter to obtain the original output data of the current sub-level component and the current correction parameters output by the current target adapter;

Correcting the original output data using the current correction parameters to obtain the target output data of the current sub-level component;

determining the target output data as the target input data of the next sub-layer component, determining the next sub-layer component as the current sub-layer component, and returning to the operation of determining the first target adapter of the current sub-layer component as the current target adapter, until there is no next sub-layer component;

in the case where there is no next sub-layer component, inputting the target output data of the current sub-layer component into the layer following the current sub-layer component, so as to obtain the target sentence.
The method according to claim 3 or 4, wherein the multilingual machine translation model further comprises an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer is connected to the input end of the first encoder sub-layer component in the multilingual machine translation model, and the output end of the output word embedding layer is connected to the input end of the first decoder sub-layer component in the multilingual machine translation model.
The method of claim 5, further comprising:

in the case of receiving the original word embedding output data of a word embedding layer, inputting the original word embedding output data into the second target adapter of the word embedding layer to obtain the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer;

correcting the original word embedding output data using the word embedding correction parameters to obtain the target word embedding output data of the word embedding layer, so that the target word embedding output data serves as the target input data of the sub-layer component connected to the word embedding layer.
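By way of illustration only, the word embedding correction can be sketched as follows. The function name `correct_embedding` is hypothetical, and the correction is again assumed to be an element-wise addition, which the claim does not specify. Note that here the second target adapter receives the output of the word embedding layer itself.

```python
from typing import Callable, List

Vector = List[float]


def correct_embedding(
    original_embedding: Vector,
    second_adapter: Callable[[Vector], Vector],
) -> Vector:
    """Correct a word embedding output with the parameters its adapter returns."""
    # The adapter is fed the original word embedding output data.
    correction = second_adapter(original_embedding)
    if len(correction) != len(original_embedding):
        raise ValueError("correction and embedding must have the same length")
    # Assumption: the correction is applied by element-wise addition.
    return [e + c for e, c in zip(original_embedding, correction)]


if __name__ == "__main__":
    # Toy adapter that returns half of each embedding value as the correction.
    print(correct_embedding([1.0, -2.0], lambda e: [0.5 * v for v in e]))
```

The returned list would then be used as the target input data of the first encoder or decoder sub-layer component, depending on which word embedding layer produced it.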
The method according to any one of claims 1-4, further comprising, before acquiring the original sentence to be translated and the translation language information of the original sentence:

for each kind of translation language information, obtaining multiple training samples that match the translation language information, and inputting the multiple training samples into the multilingual machine translation model, so as to train the adapter corresponding to the translation language information.
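By way of illustration only, training one adapter per kind of translation language information can be sketched with a deliberately small toy. Everything here is an assumption made for the example: the preset model is a scalar function whose parameters are held fixed, each adapter is a single additive correction term, the training objective is squared error minimised by gradient descent, and the names `train_bias_adapter` and `train_adapters` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, float]  # (model input, reference output), toy scalars


def train_bias_adapter(
    frozen_model: Callable[[float], float],
    samples: List[Sample],
    learning_rate: float = 0.1,
    epochs: int = 200,
) -> float:
    """Fit one additive correction term while the model itself stays fixed."""
    if not samples:
        raise ValueError("at least one training sample is required")
    bias = 0.0
    for _ in range(epochs):
        # Mean gradient of (frozen_model(x) + bias - y)^2 with respect to bias.
        grad = sum(2.0 * (frozen_model(x) + bias - y) for x, y in samples)
        bias -= learning_rate * grad / len(samples)
    return bias


def train_adapters(
    frozen_model: Callable[[float], float],
    samples_by_language: Dict[str, List[Sample]],
) -> Dict[str, float]:
    """Train a separate adapter for each kind of translation language information."""
    return {
        language: train_bias_adapter(frozen_model, samples)
        for language, samples in samples_by_language.items()
    }


if __name__ == "__main__":
    data = {"en-de": [(1.0, 2.5), (2.0, 4.5)], "en-fr": [(1.0, 1.0)]}
    print(train_adapters(lambda x: 2.0 * x, data))
```

Only the per-language correction term is updated in this sketch, so adding a new language direction does not alter the behaviour learned for the others.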
A translation device based on a multilingual machine translation model, comprising:

a statement acquisition module, configured to acquire the original sentence to be translated and the translation language information of the original sentence;

an adapter determination module, configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model;

a translation module, configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
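By way of illustration only, the cooperation of the three modules can be sketched as follows. The class `TranslationDevice` and its method names are hypothetical, the translation language information is assumed to be a (source language, target language) pair, and the adapter is simplified to a function that corrects the base model's output string, which is far coarser than a per-layer correction.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

LanguageInfo = Tuple[str, str]  # assumed encoding: (source language, target language)
TextFn = Callable[[str], str]


@dataclass
class TranslationDevice:
    base_model: TextFn
    adapters: Dict[LanguageInfo, TextFn] = field(default_factory=dict)

    def acquire(self, sentence: str, source: str, target: str):
        """Statement acquisition: pair the sentence with its language information."""
        return sentence, (source, target)

    def determine_adapter(self, language_info: LanguageInfo) -> TextFn:
        """Adapter determination: look up the adapter for this language direction."""
        if language_info not in self.adapters:
            raise KeyError(f"no adapter registered for {language_info}")
        return self.adapters[language_info]

    def translate(self, sentence: str, source: str, target: str) -> str:
        """Translation: run the base model, then apply the adapter's correction."""
        sentence, language_info = self.acquire(sentence, source, target)
        adapter = self.determine_adapter(language_info)
        return adapter(self.base_model(sentence))


if __name__ == "__main__":
    device = TranslationDevice(
        base_model=str.upper,  # stand-in for the preset multilingual model
        adapters={("en", "de"): lambda text: text + "!"},
    )
    print(device.translate("hi", "en", "de"))
```

Keeping the adapters in a lookup keyed by language direction reflects the idea that one preset model is shared while only the selected adapter varies per request.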
An electronic device comprising:

at least one processor;

a memory, arranged to store at least one program;

wherein, when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the translation method based on a multilingual machine translation model according to any one of claims 1-7.
A computer-readable storage medium storing a computer program, wherein when the program is executed by a processor, the translation method based on a multilingual machine translation model according to any one of claims 1-7 is implemented.