WO2022116821A1 - Translation method and apparatus employing multi-language machine translation model, device, and medium - Google Patents

Translation method and apparatus employing multi-language machine translation model, device, and medium

Info

Publication number
WO2022116821A1
Authority
WO
WIPO (PCT)
Prior art keywords
sub
target
layer
machine translation
word embedding
Prior art date
Application number
PCT/CN2021/131090
Other languages
French (fr)
Chinese (zh)
Inventor
赵程绮
朱耀明
王明轩
封江涛
李磊
Original Assignee
北京有竹居网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京有竹居网络技术有限公司
Publication of WO2022116821A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Definitions

  • the present disclosure relates to the field of computer technology, for example, to a translation method, apparatus, device and medium based on a multilingual machine translation model.
  • Machine Translation is one of the core tasks in natural language processing, which aims to use computer programs to translate one natural language into another natural language.
  • Traditional machine translation models are generally bilingual machine translation models, which can handle translation in one language direction, such as translating English into Chinese.
  • a large number of bilingual machine translation models need to be trained to achieve pairwise translation between each pair of natural languages.
  • the multilingual machine translation model has gradually replaced the bilingual machine translation model and become one of the commonly used machine translation models.
  • the performance of the multilingual machine translation model is often inferior to that of the bilingual machine translation model, resulting in large translation errors in the translation results output by the multilingual machine translation model.
  • the present disclosure provides a translation method, apparatus, device and medium based on a multilingual machine translation model, so as to improve the accuracy of the translation result output by the multilingual machine translation model.
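  • the present disclosure provides a translation method based on a multilingual machine translation model, including:
  • acquiring the original sentence to be translated and the translation language information of the original sentence;
  • determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model;
  • translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.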
  • the present disclosure also provides a translation device based on a multilingual machine translation model, including:
  • a sentence acquisition module configured to acquire the original sentence to be translated and the translation language information of the original sentence
  • an adapter determination module configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model
  • a translation module configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
  • the present disclosure also provides an electronic device, comprising:
  • one or more processors
  • a memory configured to store one or more programs
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the above-mentioned translation method based on a multilingual machine translation model.
  • the present disclosure also provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the above-mentioned translation method based on a multilingual machine translation model.
  • FIG. 1 is a schematic flowchart of a translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic structural diagram of an adapter according to an embodiment of the present disclosure
  • FIG. 3 is a schematic structural diagram of a multilingual machine translation model according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a connection relationship of a target adapter according to an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of another translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure
  • FIG. 6 is a structural block diagram of a translation device based on a multilingual machine translation model provided by an embodiment of the present disclosure
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • method embodiments of the present disclosure may be performed in different orders, and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this regard.
  • the term “including” and variations thereof are open-ended inclusions, i.e., “including but not limited to”.
  • the term “based on” means “based at least in part on”.
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • FIG. 1 is a schematic flowchart of a translation method based on a multilingual machine translation model according to an embodiment of the present disclosure.
  • the method may be performed by a translation apparatus based on a multilingual machine translation model, wherein the apparatus may be implemented by software and/or hardware and may be configured in an electronic device; for example, the apparatus may be configured in a mobile phone, a tablet computer or a computer device.
  • the translation method based on the multilingual machine translation model provided by this embodiment may include:
  • the original sentence is the sentence that needs to be translated this time, which can be input by the user through an input device such as a keyboard or recognized by text recognition or voice recognition; that is, when the user needs to translate a sentence, the user can enter the sentence into the electronic device by text or voice input, or can take a picture containing the sentence or obtain a text containing the sentence and import the picture or text into the electronic device.
  • the translation method based on the multilingual machine translation model provided in this embodiment can translate the text or voice input by the user, and can also translate the sentences contained in the pictures or text imported by the user.
  • when translating a sentence in a voice or picture, the sentence can be converted into an original sentence in text form before translation.
  • the translation language information of the original sentence can be understood as the translation direction information of this translation, which can include the original language information (that is, the language to which the original sentence to be translated belongs) and the target language information (that is, the language to which the target sentence obtained by translating the original sentence belongs); the language information may be, for example, English, Chinese, or German.
  • when a user needs to translate an original sentence, the user inputs into the electronic device the original sentence, the language information of the original language to which the original sentence belongs, and the language information of the target language into which the original sentence needs to be translated, so as to generate a translation instruction for the original sentence; correspondingly, when receiving the translation instruction for the original sentence, the electronic device obtains the original sentence and determines the translation language information of the original sentence, for example, by determining the language information of the original language selected by the user on the translation page as the original language information and the language information of the target language selected by the user on the translation page as the target language information.
  • adapters for the multilingual machine translation model can be set for different translation scenarios, so that when the multilingual machine translation model is used in a translation scenario, the adapter corresponding to that translation scenario is used to correct the translation error of the multilingual machine translation model, thereby improving the accuracy of the translation result output by the multilingual machine translation model.
  • the parameter amount of the adapter is very small (less than one-twentieth of that of the multilingual machine translation model, and the larger the multilingual machine translation model is, the smaller the ratio is); therefore, correcting the translation error of the multilingual machine translation model by configuring adapters incurs very little cost and is easy to deploy.
  • after acquiring the translation language information of the original sentence, the electronic device can acquire, from multiple preset adapters and according to the translation language information, the adapter corresponding to the translation language information as the target adapter.
  • the number of adapters corresponding to one piece of translation language information may be one or more. That is, in this embodiment, only one adapter may be set for each piece of translation language information, and this adapter is used to correct the translation error of the multilingual machine translation model; it is also possible to set multiple adapters for each piece of translation language information, and the multiple adapters are used to correct the translation errors of the multilingual machine translation model, thereby improving the accuracy of the translation result output by the multilingual machine translation model.
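  • as an illustration of selecting the target adapter from the preset adapters, the following minimal Python sketch keys a registry by translation direction. The registry contents, the language codes and the function name `determine_target_adapters` are hypothetical and are not taken from the disclosure; real entries would hold trained adapter parameters rather than names.

```python
from typing import Dict, List, Tuple

# Hypothetical registry: (original language, target language) -> adapter names.
# The names are placeholders standing in for trained adapter parameters.
ADAPTER_REGISTRY: Dict[Tuple[str, str], List[str]] = {
    ("en", "zh"): ["en-zh/encoder", "en-zh/decoder"],  # several adapters for one direction
    ("en", "de"): ["en-de/shared"],                    # a single adapter for one direction
}


def determine_target_adapters(original_lang: str, target_lang: str) -> List[str]:
    """Return the preset adapters for this translation direction (empty if none are set)."""
    return ADAPTER_REGISTRY.get((original_lang, target_lang), [])


print(determine_target_adapters("en", "zh"))
```

  • a direction without a preset adapter returns an empty list here; how such a case should be handled is not stated in the disclosure.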
  • the following is an example of this situation.
  • the adapters used can be different; the structure of the adapter can be selected flexibly.
  • An activation function may be configured between the first feedforward layer and the second feedforward layer, and the activation function may be a Gaussian Error Linear Unit (GELU), as shown in FIG. 2.
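  • the adapter structure described here, two feedforward layers with a Gaussian Error Linear Unit between them, can be sketched in plain Python as follows. This is only a sketch under assumptions: the class name `Adapter`, the helper `linear`, the weight layout and the layer sizes are illustrative and are not specified in the disclosure.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Compute weights @ x + bias for a single input vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


class Adapter:
    """Two feedforward layers with a GELU activation between them (sizes are assumptions)."""

    def __init__(self, w1: Matrix, b1: Vector, w2: Matrix, b2: Vector) -> None:
        self.w1, self.b1 = w1, b1  # first feedforward layer
        self.w2, self.b2 = w2, b2  # second feedforward layer

    def __call__(self, x: Vector) -> Vector:
        hidden = [gelu(h) for h in linear(self.w1, self.b1, x)]
        return linear(self.w2, self.b2, hidden)


# Toy weights: a 2 -> 2 first layer followed by a 2 -> 1 second layer.
adapter = Adapter([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [[1.0, 1.0]], [0.0])
print(adapter([1.0, -1.0]))
```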
  • the type of the multilingual machine translation model can be set as required; for example, the multilingual machine translation model can be a Transformer model, as shown in FIG. 3 (FIG. 3 shows only one encoder and one decoder by way of example), and the multilingual machine translation model may include at least one encoder and at least one decoder.
  • for example, the multilingual machine translation model includes multiple encoders and multiple decoders, such as 6 encoders and 6 decoders; each encoder is provided with at least two encoder sub-layers, namely a self-attention layer and a feedforward layer; each decoder is provided with at least three decoder sub-layers, namely a self-attention layer, an encoder-decoder attention layer and a feedforward layer; the multiple encoders are connected in series, the multiple decoders are connected in series, and the feedforward layer of the last encoder is connected to the encoder-decoder attention layer of each decoder.
  • corresponding adapters may be set for the encoders and decoders in the multilingual machine translation model, or corresponding adapters may be set for the encoder sub-layers and the decoder sub-layers respectively, which is not limited in this embodiment.
  • at least one encoder sub-layer in each encoder may be divided into one or more encoder sub-layer components, at least one decoder sub-layer in each decoder may be divided into one or more decoder sub-layer components, and a corresponding adapter may be set for each encoder sub-layer component and each decoder sub-layer component.
  • the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and the encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and the decoder sub-layer component is composed of at least one decoder sub-layer; each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.
  • the values of the parameters in the adapters corresponding to different sub-layers can be different; the self-attention layer and the feedforward layer in each encoder can each separately form an encoder sub-layer component, the self-attention layer and the encoder-decoder attention layer in each decoder can together form a decoder sub-layer component, and the feedforward layer in each decoder can independently form a decoder sub-layer component, as shown in FIG. 4 (FIG. 4 shows only one encoder and one decoder by way of example).
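  • the grouping of sub-layers into sub-layer components described for FIG. 4 can be written down as a small data structure. The sketch below is illustrative only: the identifier strings and the function `list_components` are hypothetical, and it simply enumerates the components that would each receive an adapter for one translation direction.

```python
from typing import List

# Grouping described in the text: in each encoder the self-attention layer and the
# feedforward layer each form their own component; in each decoder the self-attention
# and encoder-decoder attention layers share one component and the feedforward layer
# forms another.
ENCODER_COMPONENTS = [["self_attention"], ["feed_forward"]]
DECODER_COMPONENTS = [["self_attention", "encoder_decoder_attention"], ["feed_forward"]]


def list_components(num_encoders: int, num_decoders: int) -> List[str]:
    """Enumerate component identifiers (hypothetical naming), in data-flow order."""
    names: List[str] = []
    for i in range(num_encoders):
        for sublayers in ENCODER_COMPONENTS:
            names.append(f"encoder{i}:" + "+".join(sublayers))
    for i in range(num_decoders):
        for sublayers in DECODER_COMPONENTS:
            names.append(f"decoder{i}:" + "+".join(sublayers))
    return names


# With the 6-encoder, 6-decoder example from the text, each translation
# direction would need one adapter per listed component.
print(len(list_components(6, 6)))
```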
  • before the acquiring of the original sentence to be translated and the translation language information of the original sentence, the method further includes: for each piece of translation language information, acquiring a plurality of training samples consistent with the translation language information, and inputting the plurality of training samples into the multilingual machine translation model, so as to obtain the adapter corresponding to the translation language information by training.
  • the adapters corresponding to each sub-layer component (including the encoder sub-layer components and the decoder sub-layer components) under different translation language information can be obtained through training.
  • for example, for each piece of translation language information, the parameters in the multilingual machine translation model can be fixed, an original adapter can be set for each sub-layer component of the multilingual machine translation model, each original adapter is trained using training samples consistent with the translation language information, the translation error of the multilingual machine translation model is tested using test samples, and the adapter of each sub-layer component at the time when the translation error of the multilingual machine translation model is less than the preset error threshold is determined as the adapter of that sub-layer component under the translation language information.
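  • the training procedure (fixed model parameters, trainable adapter, stop once the test error is below a preset threshold) can be illustrated with a deliberately tiny stand-in. In the sketch below the "model" is a single frozen scalar weight and the "adapter" is a single scalar added to it; the function name `train_adapter`, the squared-error loss, the learning rate and the data are all assumptions chosen for illustration and do not come from the disclosure.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input, expected output)


def train_adapter(
    train_samples: List[Sample],
    test_samples: List[Sample],
    base_weight: float,        # frozen parameter of the toy "model"; never updated
    error_threshold: float,    # preset error threshold
    learning_rate: float = 0.05,
    max_epochs: int = 1000,
) -> Tuple[float, float]:
    """Train only the adapter weight; return (adapter_weight, final test error)."""
    adapter_weight = 0.0  # the "original adapter" starts as a no-op correction

    def test_error(weight: float) -> float:
        # Mean squared error of the corrected model on the test samples.
        return sum(((base_weight + weight) * x - y) ** 2 for x, y in test_samples) / len(test_samples)

    for _ in range(max_epochs):
        if test_error(adapter_weight) < error_threshold:
            break  # keep the adapter as it is once the error is below the threshold
        for x, y in train_samples:
            prediction = base_weight * x + adapter_weight * x  # model output plus correction
            gradient = 2.0 * (prediction - y) * x              # d(squared error)/d(adapter_weight)
            adapter_weight -= learning_rate * gradient
    return adapter_weight, test_error(adapter_weight)


weight, error = train_adapter([(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)], base_weight=1.5, error_threshold=1e-6)
print(weight, error)
```

  • one such adapter would be trained per translation direction; the frozen parameter plays the role of the fixed multilingual model, so the same base model can be shared by all directions.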
  • the target sentence is the sentence translated from the original sentence.
  • the data output by the multilingual machine translation model can be obtained, and the target adapter can be used to correct the data to obtain the target sentence;
  • alternatively, the adapter corresponding to an encoder or a decoder can be used to correct the data output by that encoder or decoder, the corrected data is input into the next layer, and the sentence output by the multilingual machine translation model is determined as the target sentence;
  • alternatively, the adapter corresponding to a sub-layer component can be used to correct the original data output by that sub-layer component, the corrected target data is input into the next layer, and the sentence output by the multilingual machine translation model is determined as the target sentence, so that the original sentence is translated.
  • the translation method based on the multilingual machine translation model acquires the original sentence to be translated and the translation language information of the original sentence, determines the target adapter that corresponds to the translation language information of the original sentence and is used to correct the translation error of the preset multilingual machine translation model, and translates the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence.
  • in this way, the accuracy of the translation result output by the multilingual machine translation model can be improved.
  • FIG. 5 is a schematic flowchart of another translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure.
  • the solution in this embodiment may be combined with one or more optional solutions in the foregoing embodiments.
  • the multilingual machine translation model is used to translate the original sentence
  • the translation method based on the multilingual machine translation model may include:
  • the target input data can be understood as the data input into the current sub-layer component.
  • when translating the original sentence, the original sentence can be input into the multilingual machine translation model, and each layer is controlled in turn, according to the connection relationship between the multiple layers of the multilingual machine translation model, to process the information output by its previous layer; when the target output data of the layer preceding the first sub-layer component in the multilingual machine translation model is obtained, the first sub-layer component is determined as the current sub-layer component, and the target output data of its previous layer is determined as the target input data of the current sub-layer component.
  • the first target adapter can be understood as the adapter configured for a sub-layer component and used for correcting the original output data of the sub-layer component, that is, the adapter configured for an encoder sub-layer component of the encoder or a decoder sub-layer component of the decoder.
  • the target adapter corresponding to the translation language information of the original sentence may include the first target adapter configured for each sub-layer component in the multilingual machine translation model; it may also include the second target adapter configured for the word embedding layers (including the input word embedding layer and the output word embedding layer) in the multilingual machine translation model.
  • the first target adapter corresponding to the component identification information of the current sub-layer component may be selected, according to that component identification information, from the plurality of target adapters determined in S202 as the current target adapter.
  • S205 Input the target input data into the current sub-layer component and the current target adapter, so as to obtain the original output data of the current sub-layer component and the current correction parameters output by the current target adapter.
  • the original output data of the current sub-layer component can be understood as the uncorrected data obtained by the current sub-layer component operating on its target input data.
  • the current correction parameters can be understood as parameters for correcting the original output data of the current sub-layer component, which can be calculated by the current target adapter configured for the current sub-layer component according to the target input data of the current sub-layer component.
  • the target input data of the current sub-layer component can be input into both the current sub-layer component and the current target adapter; the data output by the current sub-layer component is obtained as the original output data of the current sub-layer component, and the data output by the current target adapter is obtained as the current correction parameters.
  • the current correction parameters can be used to correct the original output data; for example, the original output data is corrected to the sum of the original output data and the current correction parameters, and the data obtained by correcting the original output data is determined as the target output data of the current sub-layer component.
  • according to the connection relationship of the multiple sub-layer components in the multilingual machine translation model, it can be judged whether the next layer connected to the output end of the current sub-layer component is located in a sub-layer component; if so, it is determined that a next sub-layer component exists, and the sub-layer component to which that next layer belongs is determined as the next sub-layer component.
  • the positions of the multiple sub-layers connected to the output end of the current sub-layer component can be determined according to the data flow in the multilingual machine translation model, and the sub-layer component through which the data first flows is determined as the next sub-layer component.
  • S208 Determine the target output data as the target input data of the next sub-layer component, determine the next sub-layer component as the current sub-layer component, and return to executing S204.
  • S209 Input the target output data of the current sub-layer component into the next layer of the current sub-layer component to obtain a target sentence.
  • the electronic device can output the target output data to the multilingual machine translation model.
  • the multilingual machine translation model can output the target sentence obtained by translating the original sentence through the output layer.
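  • the loop over sub-layer components described in this embodiment (feed the same target input data to the component and to its adapter, add the adapter output to the component output, pass the result on) can be sketched as follows. The sketch makes simplifying assumptions: components and adapters are plain Python callables on a list of floats, the components form a single chain, and the function name `run_components` is hypothetical; a real decoder component would also receive the encoder output, which is omitted here.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_components(components: Sequence[Tuple[Layer, Layer]], target_input: Vector) -> Vector:
    """Run (component, adapter) pairs in data-flow order and return the last target output."""
    for component, adapter in components:
        original_output = component(target_input)   # uncorrected output of the component
        correction = adapter(target_input)          # current correction parameters
        # Correction by summation, the example given in the text.
        target_output = [o + c for o, c in zip(original_output, correction)]
        target_input = target_output                # becomes the next component's input
    return target_input


# Toy chain of two components with their adapters.
chain = [
    (lambda v: [2.0 * x for x in v], lambda v: [1.0 for _ in v]),
    (lambda v: [x + 1.0 for x in v], lambda v: [0.5 * x for x in v]),
]
print(run_components(chain, [1.0, 2.0]))  # [5.5, 8.5]
```

  • because the adapter only adds a correction to the component output, an adapter that outputs zeros leaves the behavior of the underlying model unchanged.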
  • the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer is connected to the input end of the first encoder sub-layer component in the multilingual machine translation model, and the output end of the output word embedding layer is connected to the input end of the first decoder sub-layer component in the multilingual machine translation model.
  • the multilingual machine translation model may also be provided with an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer may be connected to the input end of the self-attention layer of the first encoder in the multilingual machine translation model, and the output end of the output word embedding layer can be connected to the input end of the self-attention layer of the first decoder in the multilingual machine translation model.
  • the input word embedding layer and the output word embedding layer of the translation model better model word semantics.
  • the translation method based on the multilingual machine translation model further comprises: when receiving the original word embedding output data of the word embedding layer, inputting the original word embedding output data into the second target adapter of the word embedding layer, and obtaining the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is an input word embedding layer or an output word embedding layer; and using the word embedding correction parameters to correct the original word embedding output data to obtain the target word embedding output data of the word embedding layer, so as to use the target word embedding output data as the target input data of the sub-layer component connected to the word embedding layer.
  • the second target adapter can be understood as an adapter configured by the input word embedding layer or the output word embedding layer to correct the original word embedding output data of the input word embedding layer or the output word embedding layer.
  • the original word embedding output data can be understood as the data output by the word embedding layer.
  • the electronic device may first obtain a second target adapter corresponding to the identification information of the input word embedding layer, input the first original word embedding output data into the second target adapter, and obtain the data output by the second target adapter as the word embedding correction parameters; then correct the first original word embedding output data using the word embedding correction parameters, use the corrected first original word embedding output data as the target word embedding output data of the input word embedding layer, and input the target word embedding output data into the self-attention layer connected to the output end of the input word embedding layer.
  • similarly, the electronic device may first obtain a second target adapter corresponding to the identification information of the output word embedding layer, input the second original word embedding output data into the second target adapter, and obtain the data output by the second target adapter as the word embedding correction parameters; then correct the second original word embedding output data using the word embedding correction parameters, use the corrected second original word embedding output data as the target word embedding output data of the output word embedding layer, and input the target word embedding output data into the self-attention layer connected to the output end of the output word embedding layer.
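  • the word embedding correction differs from the sub-layer case in that the second target adapter receives the output of the word embedding layer rather than its input. A minimal sketch is given below; the embedding table, the function name `corrected_embeddings` and the use of summation as the correction are assumptions (the text gives summation only as an example for sub-layer components and does not fix the operation for word embedding layers).

```python
from typing import Callable, Dict, List

Vector = List[float]


def corrected_embeddings(
    token_ids: List[int],
    embedding_table: Dict[int, Vector],      # stands in for a word embedding layer
    adapter: Callable[[Vector], Vector],     # stands in for the second target adapter
) -> List[Vector]:
    """Look up each token, pass the embedding through the adapter, and add the correction."""
    corrected: List[Vector] = []
    for token_id in token_ids:
        original = embedding_table[token_id]  # original word embedding output data
        correction = adapter(original)        # word embedding correction parameters
        corrected.append([o + c for o, c in zip(original, correction)])  # assumed: summation
    return corrected


table = {0: [1.0, 2.0], 1: [0.0, -1.0]}
print(corrected_embeddings([0, 1], table, lambda v: [0.5 * x for x in v]))
# [[1.5, 3.0], [0.0, -1.5]]
```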
  • adapters corresponding to different translation language information are set for each encoder component and each decoder component in the multilingual machine translation model, and when the original sentence is translated, the adapters set for each encoder component and each decoder component are used to correct the data output by the corresponding encoder or decoder, which can improve the translation accuracy of the multilingual machine translation model while adding only a small number of parameters.
  • FIG. 6 is a structural block diagram of a translation apparatus based on a multilingual machine translation model according to an embodiment of the present disclosure.
  • the apparatus can be implemented by software and/or hardware, and can be configured in electronic equipment.
  • the apparatus can be configured in mobile phones, tablet computers, or computer equipment, and can perform sentence translation by executing a translation method based on a multilingual machine translation model.
  • the translation apparatus based on a multilingual machine translation model may include: a sentence acquisition module 601, an adapter determination module 602, and a translation module 603, wherein the sentence acquisition module 601 is configured to acquire the original sentence to be translated and the translation language information of the original sentence; the adapter determination module 602 is configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct the translation error of a preset multilingual machine translation model; and the translation module 603 is configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence.
  • the original sentence to be translated and the translation language information of the original sentence are obtained through the sentence acquisition module, the target adapter that corresponds to the translation language information of the original sentence and is used to correct the translation error of the preset multilingual machine translation model is determined through the adapter determination module, and the original sentence is translated by the translation module based on the multilingual machine translation model and the target adapter to obtain the target sentence.
  • in this way, the accuracy of the translation result output by the multilingual machine translation model can be improved.
  • the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and the encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and the decoder sub-layer component is composed of at least one decoder sub-layer; each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.
  • the translation module 603 is configured to: use the multilingual machine translation model to translate the original sentence, and use the first target adapter of each sub-layer component to correct the original output value of that sub-layer component to obtain the target sentence, wherein the sub-layer components include an encoder sub-layer component and/or a decoder sub-layer component.
  • the translation module 603 includes: a component determination unit, configured to determine, according to the connection relationship of multiple sub-layer components in the multilingual machine translation model, the first sub-layer component in the multilingual machine translation model as the current sub-layer component, and acquire the target input data of the current sub-layer component; an adapter acquisition unit, configured to determine the first target adapter of the current sub-layer component as the current target adapter; a parameter determination unit, configured to input the target input data into the current sub-layer component and the current target adapter, so as to obtain the original output data of the current sub-layer component and the current correction parameters output by the current target adapter; a correction unit, configured to correct the original output data by using the current correction parameters to obtain the target output data of the current sub-layer component; a calling unit, configured to determine the target output data as the target input data of the next sub-layer component, determine the next sub-layer component as the current sub-layer component, and return to calling the adapter acquisition unit until there is no next sub-layer component; and an input unit, configured to input the target output data of the current sub-layer component into the next layer of the current sub-layer component to obtain the target sentence.
  • the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer is connected to the input end of the first encoder sub-layer component in the multilingual machine translation model, and the output end of the output word embedding layer is connected to the input end of the first decoder sub-layer component in the multilingual machine translation model.
  • the translation device based on the multi-language machine translation model further includes: an adapter input module, configured to input the original word embedding output data when receiving the original word embedding output data of the word embedding layer. into the second target adapter of the word embedding layer, and obtain the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; the embedding layer correction module , set to use the word embedding correction parameters to correct the original word embedding output data to obtain the target word embedding output data of the word embedding layer, so as to connect the target word embedding output data as the word embedding layer The target input data of the sub-level component.
  • the translation apparatus based on the multilingual machine translation model further includes: an adapter training module, configured to, before the original sentence to be translated and the translation language information of the original sentence are acquired, obtain, for each kind of translation language information, a plurality of training samples that match the translation language information, and input the training samples into the multilingual machine translation model, so as to train and obtain the adapter corresponding to the translation language information.
  • the translation apparatus based on the multilingual machine translation model provided by the embodiment of the present disclosure can execute the translation method based on the multilingual machine translation model provided by any embodiment of the present disclosure, and has the corresponding functional modules and effects for executing that translation method.
  • Referring to FIG. 7, it shows a schematic structural diagram of an electronic device (e.g., a terminal device) 700 suitable for implementing an embodiment of the present disclosure.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile phones, notebook computers, digital broadcast receivers, Personal Digital Assistants (PDAs), tablet computers (PADs), Portable Media Players (PMPs), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as stationary terminals such as digital televisions (TVs) and desktop computers.
  • the electronic device shown in FIG. 7 is only an example, and
  • the electronic device 700 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 701, which may perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 702 or a program loaded from a storage device 708 into a Random Access Memory (RAM) 703.
  • In the RAM 703, various programs and data required for the operation of the electronic device 700 are also stored.
  • the processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704.
  • An Input/Output (I/O) interface 705 is also connected to the bus 704.
  • the following devices can be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 707 including, for example, a Liquid Crystal Display (LCD), speaker, vibrator, etc.; storage devices 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709.
  • The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data.
  • Although FIG. 7 shows an electronic device 700 having various devices, it is not required to implement or have all of the illustrated devices. More or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 709, or from the storage device 708, or from the ROM 702.
  • When the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium described above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • Examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, RAM, ROM, Erasable Programmable Read-Only Memory (EPROM or flash memory), optical fiber, portable Compact Disc Read-Only Memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the program code embodied on the computer readable medium can be transmitted by any suitable medium, including but not limited to: electric wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the above.
  • clients and servers can communicate using any currently known or future developed network protocol, such as HyperText Transfer Protocol (HTTP), and can be interconnected by digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device: obtains the original sentence to be translated and the translation language information of the original sentence; determines the target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct the translation error of a preset multilingual machine translation model; and translates the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence.
  • Computer program code for performing operations of the present disclosure may be written in one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a LAN or WAN, or, alternatively, may be connected to an external computer (eg, using an Internet service provider to connect through the Internet).
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the name of the module does not constitute a limitation on the unit itself.
  • exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. Examples of machine-readable storage media include one or more wire-based electrical connections, portable computer disks, hard disks, RAM, ROM, EPROM or flash memory, optical fibers, CD-ROMs, optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • Example 1 provides a translation method based on a multilingual machine translation model, including:
  • Example 2 According to the method of Example 1, the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and the encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and the decoder sub-layer component is composed of at least one decoder sub-layer; each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.
  • Example 3 the translation of the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence includes:
  • Each of the sub-layer components includes an encoder sub-layer component and/or a decoder sub-layer component.
  • Example 4 According to the method of Example 3, translating the original sentence using the multilingual machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of that sub-layer component to obtain the target sentence, includes:
  • Determine the first sub-layer component in the multilingual machine translation model as the current sub-layer component, and obtain the target input data of the current sub-layer component; determine the first target adapter of the current sub-layer component as the current target adapter; input the target input data into the current sub-layer component and the current target adapter to obtain the original output data of the current sub-layer component and the current correction parameters output by the current target adapter; use the current correction parameters to correct the original output data to obtain the target output data of the current sub-layer component; determine the target output data as the target input data of the next sub-layer component, determine the next sub-layer component as the current sub-layer component, and return to the operation of determining the first target adapter of the current sub-layer component as the current target adapter, until there is no next sub-layer component; when there is no next sub-layer component, input the target output data of the current sub-layer component into the next layer of the current sub-layer component to
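The iteration in Example 4 can be sketched as a simple loop. This is a minimal illustration, not the patent's implementation: the function name `run_sub_layer_components`, the list-of-floats data type, and in particular the additive form of the correction are assumptions, since the text only says that the correction parameters are used to correct the original output data.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_sub_layer_components(
    components: Sequence[Layer],
    adapters: Sequence[Layer],
    first_input: Vector,
) -> Vector:
    """Pass data through each sub-layer component, correcting every raw output.

    components[i] and adapters[i] stand for the i-th sub-layer component and
    its first target adapter (hypothetical callables used for illustration).
    """
    target_input = first_input
    for component, adapter in zip(components, adapters):
        # Both the component and its adapter receive the same target input.
        original_output = component(target_input)
        correction = adapter(target_input)
        # Assumption: the correction is applied additively, element by element.
        target_output = [o + c for o, c in zip(original_output, correction)]
        # The corrected output becomes the input of the next component.
        target_input = target_output
    return target_input
```

When the loop ends there is no next sub-layer component, and the returned value is what would be handed to the layer that follows the last component.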
  • Example 5 According to the method of Example 3 or 4, the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer; the output terminal of the input word embedding layer is connected to the input terminal of the first encoder sub-layer component in the multilingual machine translation model, and the output terminal of the output word embedding layer is connected to the input terminal of the first decoder sub-layer component in the multilingual machine translation model.
  • Example 6 is the method of Example 5, further comprising:
  • When receiving the original word embedding output data of a word embedding layer, input the original word embedding output data into the second target adapter of the word embedding layer, and obtain the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; correct the original word embedding output data using the word embedding correction parameters to obtain the target word embedding output data of the word embedding layer, so that the target word embedding output data serves as the target input data of the sub-layer component connected to the word embedding layer.
  • Example 7 according to the method described in any one of Examples 1-4, before the acquiring the original sentence to be translated and the translation language information of the original sentence, further includes:
  • For each kind of translation language information, obtain multiple training samples that match the translation language information, and input the multiple training samples into the multilingual machine translation model, so as to train and obtain the adapter corresponding to the translation language information.
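The per-language training step in Example 7 amounts to partitioning the training samples by translation language information and training one adapter per partition. The sketch below shows only that bookkeeping; the sample layout and the `train_one` callback are hypothetical, and the actual training procedure is not specified in the text.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

LanguageInfo = Tuple[str, str]        # (original language, target language)
Sample = Tuple[str, str, str, str]    # (original language, target language, source text, reference)
Pair = Tuple[str, str]                # (source text, reference)


def group_samples_by_language(samples: Iterable[Sample]) -> Dict[LanguageInfo, List[Pair]]:
    """Collect the training samples that match each translation language information."""
    grouped: Dict[LanguageInfo, List[Pair]] = defaultdict(list)
    for src_lang, tgt_lang, source, reference in samples:
        grouped[(src_lang, tgt_lang)].append((source, reference))
    return dict(grouped)


def train_adapters(
    samples: Iterable[Sample],
    train_one: Callable[[LanguageInfo, List[Pair]], object],
) -> Dict[LanguageInfo, object]:
    """Train one adapter per translation language information.

    train_one is a hypothetical callback that feeds the matching samples
    through the multilingual model and returns the trained adapter.
    """
    grouped = group_samples_by_language(samples)
    return {info: train_one(info, pairs) for info, pairs in grouped.items()}
```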
  • Example 8 provides a translation apparatus based on a multilingual machine translation model, including:
  • a sentence acquisition module, configured to acquire the original sentence to be translated and the translation language information of the original sentence; an adapter determination module, configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct the translation error of the preset multilingual machine translation model; and a translation module, configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence.
  • Example 9 provides an electronic device, comprising:
  • the translation method based on a multilingual machine translation model described in any one of Examples 1-7.
  • Example 10 provides a computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the translation method based on a multilingual machine translation model described in any one of Examples 1-7.

Abstract

Provided are a translation method and apparatus employing a multi-language machine translation model, a device, and a medium. The translation method employing a multi-language machine translation model comprises: acquiring an original phrase to be translated and translation language information of the original phrase (S101); determining a target adapter corresponding to the translation language information of the original phrase, wherein the target adapter is used to correct translation errors of a preset multi-language machine translation model (S102); and translating the original phrase on the basis of the multi-language machine translation model and the target adapter, and obtaining a target phrase (S103).

Description

Translation method, apparatus, device and medium based on multilingual machine translation model
This application claims priority to Chinese patent application No. 202011409340.1, filed with the China Patent Office on December 4, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer technology, for example, to a translation method, apparatus, device and medium based on a multilingual machine translation model.
Background
Machine Translation (MT) is one of the core tasks in natural language processing; it aims to use computer programs to translate one natural language into another.
Traditional machine translation models are generally bilingual machine translation models, which can handle translation in a single language direction, such as translating English into Chinese. When the number of languages is large, a very large number of bilingual machine translation models must be trained to achieve pairwise translation between every pair of natural languages, so multilingual machine translation models have gradually replaced bilingual ones and become one of the commonly used types of machine translation model.
However, under the same parameter configuration and model architecture, the performance of a multilingual machine translation model is often inferior to that of a bilingual machine translation model, so the translation results output by the multilingual machine translation model contain relatively large translation errors.
Summary
The present disclosure provides a translation method, apparatus, device and medium based on a multilingual machine translation model, so as to improve the accuracy of the translation results output by the multilingual machine translation model.
The present disclosure provides a translation method based on a multilingual machine translation model, including:
acquiring an original sentence to be translated and translation language information of the original sentence;
determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model;
translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
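The three steps above can be sketched as one function. This is only an outline under stated assumptions: `translate`, the `(original language, target language)` tuple used as translation language information, and the callable standing in for the multilingual model are illustrative names that do not appear in the disclosure.

```python
from typing import Callable, Dict, Tuple

LanguageInfo = Tuple[str, str]            # (original language, target language)
Adapter = str                             # stand-in for a set of adapter weights
Model = Callable[[str, Adapter], str]     # stand-in for model plus adapter inference


def translate(
    original_sentence: str,
    language_info: LanguageInfo,
    adapters: Dict[LanguageInfo, Adapter],
    model: Model,
) -> str:
    """Translate one sentence with the adapter chosen for its language direction."""
    # Step 1: the original sentence and its translation language information
    # arrive as the first two arguments.
    # Step 2: determine the target adapter for this translation language information.
    target_adapter = adapters[language_info]
    # Step 3: translate with the shared multilingual model and the target adapter.
    return model(original_sentence, target_adapter)
```

A toy call such as `translate("hello", ("en", "zh"), {("en", "zh"): "en-zh"}, lambda s, a: f"{a}:{s.upper()}")` shows the data flow; a real model would replace the lambda.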
The present disclosure also provides a translation apparatus based on a multilingual machine translation model, including:
a sentence acquisition module, configured to acquire an original sentence to be translated and translation language information of the original sentence;
an adapter determination module, configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model;
a translation module, configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
The present disclosure also provides an electronic device, including:
one or more processors;
a memory, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the above translation method based on a multilingual machine translation model.
The present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the above translation method based on a multilingual machine translation model.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of an adapter provided by an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a multilingual machine translation model provided by an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the connection relationship of a target adapter provided by an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of another translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure;
FIG. 6 is a structural block diagram of a translation apparatus based on a multilingual machine translation model provided by an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described below with reference to the accompanying drawings. Although some embodiments of the present disclosure are shown in the drawings, the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; these embodiments are provided for a more thorough and complete understanding of the present disclosure. The drawings and embodiments of the present disclosure are for illustrative purposes only.
The multiple steps described in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. Furthermore, method embodiments may include additional steps and/or omit the illustrated steps. The scope of the present disclosure is not limited in this regard.
As used herein, the term "including" and its variations are open-ended, that is, "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms are given in the description below.
Concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules or units, and are not used to limit the order or interdependence of the functions performed by these apparatuses, modules or units.
The modifiers "a" and "a plurality" mentioned in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand that, unless the context clearly indicates otherwise, they should be read as "one or more".
The names of messages or information exchanged between multiple apparatuses in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of these messages or information.
FIG. 1 is a schematic flowchart of a translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure. The method may be performed by a translation apparatus based on a multilingual machine translation model, where the apparatus may be implemented in software and/or hardware and may be configured in an electronic device, for example, a mobile phone, a tablet computer or a computer device. As shown in FIG. 1, the translation method based on a multilingual machine translation model provided by this embodiment may include:
S101. Acquire an original sentence to be translated and translation language information of the original sentence.
The original sentence is the sentence to be translated this time. It may be entered by the user through an input device such as a keyboard, or obtained through text recognition or speech recognition. That is, when a user needs to translate a sentence, the user may enter the sentence into the electronic device by text input or voice input, or may photograph a picture containing the sentence or obtain a text containing the sentence and import the picture or text into the electronic device. Correspondingly, the translation method based on a multilingual machine translation model provided by this embodiment can translate text or speech entered by the user, and can also translate sentences contained in pictures or texts imported by the user. When a sentence in speech or in a picture is to be translated, the sentence may first be converted into an original sentence in text form and then translated. The following description takes the case where the user enters the original sentence by text input as an example. The translation language information of the original sentence can be understood as the translation direction information for this translation; it may include the original language information (that is, the language of the original sentence to be translated) and the target language information (that is, the language of the target sentence into which the original sentence is to be translated). The language information may be, for example, English, Chinese or German.
Exemplarily, when a user needs to translate an original sentence, the user inputs into the electronic device the original sentence, the language information of the original language to which the original sentence belongs, and the language information of the target language into which the original sentence is to be translated, so as to generate a translation instruction for the original sentence. Correspondingly, when the electronic device receives the translation instruction for the original sentence, it acquires the original sentence and determines its translation language information; for example, it determines the language information of the original language selected by the user on the translation page as the original language information, and the language information of the target language selected by the user on the translation page as the target language information.
S102、确定与所述原始语句的翻译语言信息对应的目标适配器,其中,所述目标适配器用于校正预先设置的多语言机器翻译模型的翻译误差。S102. Determine a target adapter corresponding to the translation language information of the original sentence, where the target adapter is used to correct translation errors of a preset multilingual machine translation model.
在本实施例中,在训练得到多语言机器翻译模型后,可以为该多语言机器翻译模型设置其在不同翻译场景下(即针对不同的翻译语言信息时)的适配器,并在对一翻译场景下的原始语句进行翻译时,采用与该翻译场景对应的适配器校正该多语言机器翻译模型的翻译误差,从而提升多语言机器翻译模型输出的翻译结果的准确性。并且,由于适配器的参数量非常小(参数量不到多语言机器翻译模型的二十分之一,且多语言机器翻译模型越大,该比值越小),因此,通过配置适配器的方式校正多语言机器翻译模型的翻译误差,所增加的参数量极少,便于部署,In this embodiment, after a multilingual machine translation model is obtained by training, adapters for the multilingual machine translation model in different translation scenarios (that is, for different translation language information) can be set, and the multilingual machine translation model can be used for a translation scenario. When translating the original sentence below, the adapter corresponding to the translation scene is used to correct the translation error of the multilingual machine translation model, thereby improving the accuracy of the translation result output by the multilingual machine translation model. Moreover, since the parameter amount of the adapter is very small (the parameter amount is less than one-twentieth of the multilingual machine translation model, and the larger the multilingual machine translation model is, the smaller the ratio is), therefore, by configuring the adapter, correcting more The translation error of the language machine translation model is very small, which is easy to deploy.
After acquiring the translation language information of the original sentence, the electronic device may, according to the translation language information, acquire the adapter corresponding to the translation language information from a plurality of preset adapters as the target adapter.
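The lookup described above can be sketched as a simple table keyed by translation language information. This is only an illustrative sketch: the registry `ADAPTERS`, the function `select_target_adapter`, the language codes and the adapter identifiers are all hypothetical names that do not appear in the disclosure.

```python
from typing import Dict, Tuple

# Hypothetical registry: (original language, target language) -> adapter identifier.
# In a real system the values would be trained adapter parameters, not strings.
ADAPTERS: Dict[Tuple[str, str], str] = {
    ("en", "zh"): "adapter_en_zh",
    ("zh", "en"): "adapter_zh_en",
}


def select_target_adapter(original_lang: str, target_lang: str) -> str:
    """Return the preset adapter that matches the translation language information."""
    key = (original_lang, target_lang)
    if key not in ADAPTERS:
        raise KeyError(f"no adapter configured for {original_lang} -> {target_lang}")
    return ADAPTERS[key]
```

Keying the table on the language pair is one possible reading of "translation language information"; keying on the original language or the target language alone would follow the same pattern.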
The number of adapters corresponding to one piece of translation language information may be one or more. That is, in this embodiment, only one adapter may be set for each kind of translation language information; correspondingly, when an original sentence matching that translation language information is translated, only that adapter is used to correct the translation error of the multilingual machine translation model. Alternatively, a plurality of adapters may be set for each kind of translation language information; correspondingly, when an original sentence matching that translation language information is translated, the plurality of adapters may be used to correct the translation error of the multilingual machine translation model, thereby improving the accuracy of the translation result output by the model. The following description takes this latter case as an example. Here, different adapters may be used for different translation language information. The structure of the adapter may be selected flexibly. For example, the adapter may include a normalization layer, a first feedforward layer and a second feedforward layer connected in sequence, and an activation function may be configured between the first feedforward layer and the second feedforward layer. The activation function may be a Gaussian Error Linear Unit (GELU), as shown in FIG. 2.
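The adapter structure described above (normalization layer, first feedforward layer, GELU, second feedforward layer) can be sketched in plain Python as follows. The layer dimensions, the use of layer normalization without learned scale and shift, and all function names are assumptions made for illustration; the disclosure does not fix them.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # shape: (out_dim, in_dim)


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def gelu(v: float) -> float:
    """Exact GELU: v * Phi(v), with Phi the standard normal CDF."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def linear(x: Vector, weight: Matrix, bias: Vector) -> Vector:
    """Feedforward layer: weight @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def adapter_forward(x: Vector, w1: Matrix, b1: Vector, w2: Matrix, b2: Vector) -> Vector:
    """Normalization -> first feedforward -> GELU -> second feedforward."""
    h = layer_norm(x)
    h = linear(h, w1, b1)
    h = [gelu(v) for v in h]
    return linear(h, w2, b2)
```

If the second feedforward layer is initialized to zero, the adapter initially outputs a zero correction, which is a common way to start from the unmodified behavior of the base model; this initialization is a design choice assumed here, not one stated in the text.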
In this embodiment, the type of the multilingual machine translation model may be set as required. For example, the multilingual machine translation model may be a Transformer model, as shown in FIG. 3 (FIG. 3 shows only one encoder and one decoder by way of example), and may include at least one encoder and at least one decoder. In one embodiment, the multilingual machine translation model includes a plurality of encoders and a plurality of decoders, for example 6 encoders and 6 decoders. Each encoder is provided with at least two encoder sub-layers, namely a self-attention layer and a feedforward layer, and each decoder is provided with at least three decoder sub-layers, namely a self-attention layer, an encoder-decoder attention layer and a feedforward layer. The encoders are connected in series, the decoders are connected in series, and the feedforward layer of the last encoder is connected to the encoder-decoder attention layer of each decoder.
For example, when adapters corresponding to one piece of translation language information are set for the multilingual machine translation model, corresponding adapters may be set for the encoders and the decoders of the model respectively, or corresponding adapters may be set for the encoder sub-layers and the decoder sub-layers of the model respectively; this embodiment does not limit this. In order to reduce the number of adapters that need to be set for the multilingual machine translation model while ensuring the accuracy of the translation result output by the model, in this embodiment at least one encoder sub-layer in each encoder may be divided into one or more encoder sub-layer components, at least one decoder sub-layer in each decoder may be divided into one or more decoder sub-layer components, and a corresponding adapter may be set for each encoder sub-layer component and each decoder sub-layer component. In this case, the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and each encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and each decoder sub-layer component is composed of at least one decoder sub-layer; and each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information. The parameter values in the adapters corresponding to different sub-layers may differ. The self-attention layer and the feedforward layer in each encoder may each form an encoder sub-layer component on its own; the self-attention layer and the encoder-decoder attention layer in each decoder may together form one decoder sub-layer component; and the feedforward layer in each decoder may form a decoder sub-layer component on its own, as shown in FIG. 4 (FIG. 4 shows only one encoder and one decoder by way of example).
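To make the saving from grouping concrete, the following sketch counts the adapters needed per piece of translation language information for the 6-encoder, 6-decoder example, using the grouping of FIG. 4 as described in the text. The variable and function names are hypothetical, and the counts exclude any adapters set for the word embedding layers.

```python
from typing import List

NUM_ENCODERS = 6
NUM_DECODERS = 6

# Each inner list is one sub-layer component; its entries are the sub-layers it groups.
ENCODER_COMPONENTS: List[List[str]] = [["self_attention"], ["feed_forward"]]
DECODER_COMPONENTS: List[List[str]] = [
    ["self_attention", "encoder_decoder_attention"],
    ["feed_forward"],
]


def adapters_per_component(groups: List[List[str]], num_blocks: int) -> int:
    """One adapter per sub-layer component."""
    return len(groups) * num_blocks


def adapters_per_sublayer(groups: List[List[str]], num_blocks: int) -> int:
    """One adapter per individual sub-layer."""
    return sum(len(group) for group in groups) * num_blocks


component_total = (adapters_per_component(ENCODER_COMPONENTS, NUM_ENCODERS)
                   + adapters_per_component(DECODER_COMPONENTS, NUM_DECODERS))
sublayer_total = (adapters_per_sublayer(ENCODER_COMPONENTS, NUM_ENCODERS)
                  + adapters_per_sublayer(DECODER_COMPONENTS, NUM_DECODERS))
```

Under these assumptions, grouping the two decoder attention layers into a single component requires 24 adapters per piece of translation language information instead of 30.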
In one implementation, before the acquiring of the original sentence to be translated and the translation language information of the original sentence, the method further includes: for each kind of translation language information, acquiring a plurality of training samples matching the translation language information, and inputting the plurality of training samples into the multilingual machine translation model, so as to obtain by training the adapter corresponding to the translation language information.
In the above implementation, the adapter corresponding to each sub-layer component (including encoder sub-layer components and decoder sub-layer components) under different translation language information may be obtained by training. For the adapter corresponding to each sub-layer component under each kind of translation language information, after training of the multilingual machine translation model is completed, the parameters of the model may be fixed, an initial adapter may be set for each sub-layer component of the model, each initial adapter may be trained with training samples matching the translation language information, and the translation error of the model may be tested with test samples. The adapter of each sub-layer component at the point where the translation error of the model falls below a preset error threshold is determined as the adapter of that sub-layer component under the translation language information.
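The training procedure above (freeze the base model, train only the adapter, stop once the test error falls below a preset threshold) can be illustrated with a deliberately tiny toy problem. Everything here is an assumption for illustration: the "model" is a single frozen scalar weight, the "adapter" is a single trainable scalar, the error is mean squared error, and the optimizer is plain gradient descent. None of these choices come from the disclosure.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input, expected output)

BASE_WEIGHT = 2.0  # stands in for the frozen multilingual model; never updated


def predict(x: float, adapter_weight: float) -> float:
    """Frozen base output plus the adapter's additive correction."""
    return BASE_WEIGHT * x + adapter_weight * x


def mean_squared_error(samples: List[Sample], adapter_weight: float) -> float:
    return sum((predict(x, adapter_weight) - y) ** 2 for x, y in samples) / len(samples)


def train_adapter(train: List[Sample], test: List[Sample], threshold: float,
                  lr: float = 0.05, max_steps: int = 1000) -> float:
    """Update only the adapter until the test error drops below the threshold."""
    adapter_weight = 0.0  # initial adapter
    for _ in range(max_steps):
        if mean_squared_error(test, adapter_weight) < threshold:
            break
        # Gradient of the training MSE with respect to the adapter weight only.
        grad = sum(2.0 * (predict(x, adapter_weight) - y) * x for x, y in train) / len(train)
        adapter_weight -= lr * grad
    return adapter_weight


train_samples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
test_samples = [(4.0, 12.0)]
trained_adapter = train_adapter(train_samples, test_samples, threshold=1e-6)
```

In this toy setup the base model computes 2x while the data follow 3x, so the adapter learns a weight close to 1.0 to make up the difference, and the base weight is left untouched.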
S103. Translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
The target sentence is the sentence obtained by translating the original sentence.
For example, under one piece of translation language information, when the multilingual machine translation model is configured with only one adapter corresponding to the model as a whole, the data output by the model may be acquired and corrected with the target adapter to obtain the target sentence. When each encoder and each decoder in the multilingual machine translation model is configured with a corresponding adapter, after an encoder or decoder outputs data, the adapter corresponding to that encoder or decoder may be used to correct the data, the corrected data is input into the next layer, and the sentence output by the model is determined as the target sentence. When each sub-layer component in each encoder and each decoder of the multilingual machine translation model is configured with a corresponding adapter, after a sub-layer component outputs original data, the adapter corresponding to that sub-layer component may be used to correct the original data, the corrected target data is input into the next layer, and the sentence output by the model is determined as the target sentence. In this case, the translating of the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence includes: translating the original sentence using the multilingual machine translation model, and correcting the original output data of each sub-layer component using the first target adapter of that sub-layer component, so as to obtain the target sentence, wherein the sub-layer components include encoder sub-layer components and/or decoder sub-layer components.
The translation method based on a multilingual machine translation model provided by this embodiment acquires the original sentence to be translated and the translation language information of the original sentence, determines the target adapter that corresponds to the translation language information of the original sentence and is used to correct the translation error of the preset multilingual machine translation model, and translates the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence. By adopting the above technical solution and using an adapter to correct the translation error of the multilingual machine translation model, this embodiment can improve the accuracy of the translation result output by the model.
FIG. 5 is a schematic flowchart of another translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure. The solution in this embodiment may be combined with one or more optional solutions in the foregoing embodiments. Optionally, the translating of the original sentence using the multilingual machine translation model and the correcting of the original output data of each sub-layer component using the first target adapter of that sub-layer component, so as to obtain the target sentence, includes: determining, according to the connection relationship of a plurality of sub-layer components in the multilingual machine translation model, the first sub-layer component in the model as the current sub-layer component, and acquiring target input data of the current sub-layer component; determining the first target adapter of the current sub-layer component as the current target adapter; inputting the target input data into the current sub-layer component and the current target adapter, so as to obtain original output data of the current sub-layer component and a current correction parameter output by the current target adapter; correcting the original output data using the current correction parameter to obtain target output data of the current sub-layer component; determining the target output data as the target input data of the next sub-layer component, determining the next sub-layer component as the current sub-layer component, and returning to the operation of determining the first target adapter of the current sub-layer component as the current target adapter, until no next sub-layer component exists; and, when no next sub-layer component exists, inputting the target output data of the current sub-layer component into the layer following the current sub-layer component, so as to obtain the target sentence.
Correspondingly, as shown in FIG. 5, the translation method based on a multilingual machine translation model provided by this embodiment may include:
S201. Acquire an original sentence to be translated and translation language information of the original sentence.
S202. Determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct a translation error of a preset multilingual machine translation model.
S203. According to the connection relationship of a plurality of sub-layer components in the multilingual machine translation model, determine the first sub-layer component in the model as the current sub-layer component, and acquire target input data of the current sub-layer component.
The target input data can be understood as the data input into the current sub-layer component.
For example, when the original sentence is translated, the original sentence may be input into the multilingual machine translation model, and each layer may be controlled in turn, according to the connection relationship between the layers of the model, to process the information output by the preceding layer. When the target output data of the layer preceding the first sub-layer component in the model is obtained, the first sub-layer component is determined as the current sub-layer component, and the target output data of its preceding layer is determined as the target input data of the current sub-layer component.
S204. Determine the first target adapter of the current sub-layer component as the current target adapter.
The first target adapter can be understood as an adapter that is configured for a sub-layer component and is used to correct the original output data of that sub-layer component, that is, an adapter configured for an encoder sub-layer component of an encoder or a decoder sub-layer component of a decoder. In this embodiment, the target adapters corresponding to the translation language information of the original sentence may include the first target adapter configured for each sub-layer component in the multilingual machine translation model, and may further include second target adapters configured for the word embedding layers (including the input word embedding layer and the output word embedding layer) in the model.
According to the component identification information of the current sub-layer component, the first target adapter corresponding to that component identification information may be selected from the plurality of target adapters determined in S202 as the current target adapter.
S205. Input the target input data into the current sub-layer component and the current target adapter, so as to obtain original output data of the current sub-layer component and a current correction parameter output by the current target adapter.
The original output data of the current sub-layer component can be understood as the uncorrected data obtained when the current sub-layer component operates on its target input data. The current correction parameter can be understood as a parameter used to correct the original output data of the current sub-layer component; it may be calculated by the current target adapter configured for the current sub-layer component from the target input data of the current sub-layer component.
For example, after the current target adapter of the current sub-layer component is determined, the target input data of the current sub-layer component may be input into both the current sub-layer component and the current target adapter. The data output by the current sub-layer component is acquired as the original output data of the current sub-layer component, and the data output by the current target adapter is acquired as the current correction parameter.
S206. Correct the original output data using the current correction parameter to obtain target output data of the current sub-layer component.
For example, the original output data may be corrected using the current correction parameter, for instance by replacing the original output data with the sum of the original output data and the current correction parameter, and the data obtained by this correction is determined as the target output data of the current sub-layer component.
S207. Determine whether a next sub-layer component exists. If a next sub-layer component exists, execute S208; if no next sub-layer component exists, execute S209.
For example, according to the connection relationship of the plurality of sub-layer components in the multilingual machine translation model, it may be determined whether the next layer connected to the output of the current sub-layer component belongs to a sub-layer component. If it does, it is determined that a next sub-layer component exists, and the sub-layer component to which that next layer belongs is determined as the next sub-layer component.
If the output of the current sub-layer component is connected to the inputs of a plurality of sub-layers, then, following the direction of data flow in the multilingual machine translation model, the sub-layer component through which the data flows first, among the sub-layer components containing those sub-layers, may be determined as the next sub-layer component.
S208. Determine the target output data as the target input data of the next sub-layer component, determine the next sub-layer component as the current sub-layer component, and return to S204.
S209. Input the target output data of the current sub-layer component into the layer following the current sub-layer component, so as to obtain the target sentence.
For example, as shown in FIG. 4, when the output of the last sub-layer component in the multilingual machine translation model is connected to the output layer of the model, the absence of a next sub-layer component means that the current sub-layer component is the last sub-layer component in the model. In this case, after obtaining the target output data of the current sub-layer component, the electronic device may output the target output data to the output layer of the multilingual machine translation model, so that the model outputs, through the output layer, the target sentence obtained by translating the original sentence.
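Steps S203 to S209 amount to a loop over the sub-layer components in connection order. The sketch below assumes a strictly sequential chain of components and models each component and each adapter as a plain function on a vector; it also uses the additive correction given as an example for S206. The names `run_components`, `Layer` and `Vector` are hypothetical, and the encoder-to-decoder attention connection of a real Transformer is not modeled.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_components(first_input: Vector,
                   components: Sequence[Tuple[Layer, Layer]]) -> Vector:
    """Pass data through (sub-layer component, first target adapter) pairs in order."""
    target_input = first_input                       # S203: input of the first component
    for component, adapter in components:            # S204: pick the component's adapter
        raw_output = component(target_input)         # S205: original output data
        correction = adapter(target_input)           # S205: current correction parameter
        target_output = [r + c for r, c in zip(raw_output, correction)]  # S206
        target_input = target_output                 # S208: becomes the next input
    return target_input                              # S209: handed to the following layer


# Toy components and adapters, for illustration only.
toy_components = [
    (lambda x: [2.0 * v for v in x], lambda x: [1.0 for _ in x]),
    (lambda x: [v + 1.0 for v in x], lambda x: [0.0 for _ in x]),
]
result = run_components([1.0, 2.0], toy_components)
```

Note that, following S205, the adapter here receives the same target input data as its sub-layer component rather than the component's output.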
In one implementation, the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer. The output of the input word embedding layer is connected to the input of the first encoder sub-layer component in the multilingual machine translation model, and the output of the output word embedding layer is connected to the input of the first decoder sub-layer component in the multilingual machine translation model.
In the above implementation, as shown in FIG. 3, the multilingual machine translation model may further be provided with an input word embedding layer and an output word embedding layer. The output of the input word embedding layer may be connected to the input of the self-attention layer of the first encoder in the model, and the output of the output word embedding layer may be connected to the input of the self-attention layer of the first decoder in the model.
In order to improve the accuracy of the translation result output by the multilingual machine translation model, in the above implementation, as shown in FIG. 4, corresponding adapters may also be set for the input word embedding layer and the output word embedding layer respectively, to help the input word embedding layer and the output word embedding layer of the model better capture word semantics. The translation method based on a multilingual machine translation model further includes: when original word embedding output data of a word embedding layer is received, inputting the original word embedding output data into the second target adapter of the word embedding layer, and acquiring a word embedding correction parameter output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; and correcting the original word embedding output data using the word embedding correction parameter to obtain target word embedding output data of the word embedding layer, so that the target word embedding output data serves as the target input data of the sub-layer component connected to the word embedding layer.
The second target adapter can be understood as an adapter that is configured for the input word embedding layer or the output word embedding layer and is used to correct the original word embedding output data of that layer. The original word embedding output data can be understood as the data output by the word embedding layer.
For example, after obtaining first original word embedding output data output by the input word embedding layer, the electronic device may first acquire the second target adapter corresponding to the identification information of the input word embedding layer, input the first original word embedding output data into that second target adapter, and acquire the data output by that second target adapter as the word embedding correction parameter. The electronic device then corrects the first original word embedding output data using the word embedding correction parameter, takes the corrected data as the target word embedding output data of the input word embedding layer, and inputs the target word embedding output data into the self-attention layer connected to the output of the input word embedding layer. Similarly, after obtaining second original word embedding output data output by the output word embedding layer, the electronic device may first acquire the second target adapter corresponding to the identification information of the output word embedding layer, input the second original word embedding output data into that second target adapter, and acquire the data output by that second target adapter as the word embedding correction parameter. The electronic device then corrects the second original word embedding output data using the word embedding correction parameter, takes the corrected data as the target word embedding output data of the output word embedding layer, and inputs the target word embedding output data into the self-attention layer connected to the output of the output word embedding layer.
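The word embedding correction can be sketched in the same style. Unlike a sub-layer adapter, the second target adapter receives the embedding layer's output rather than its input. The text does not state how the correction parameter is applied to the embedding, so the additive form below is assumed by analogy with the sub-layer example; the layer identifiers, the toy adapters and the function `correct_embedding` are likewise hypothetical.

```python
from typing import Callable, Dict, List

Vector = List[float]
Adapter = Callable[[Vector], Vector]


def correct_embedding(raw_embedding: Vector, layer_id: str,
                      adapters: Dict[str, Adapter]) -> Vector:
    """Look up the second target adapter by layer id and apply its correction."""
    adapter = adapters[layer_id]
    correction = adapter(raw_embedding)   # word embedding correction parameter
    # Assumed additive correction, mirroring the sub-layer component example.
    return [r + c for r, c in zip(raw_embedding, correction)]


# Toy adapters standing in for trained second target adapters.
second_target_adapters: Dict[str, Adapter] = {
    "input_embedding": lambda v: [0.0 for _ in v],
    "output_embedding": lambda v: [-0.5 * x for x in v],
}

corrected = correct_embedding([1.0, 2.0], "output_embedding", second_target_adapters)
```

The corrected vector would then be passed to the self-attention layer connected to the corresponding word embedding layer.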
In the translation method based on a multilingual machine translation model provided by this embodiment, adapters corresponding to different translation language information are set for each encoder component and each decoder component in the multilingual machine translation model, and when the original sentence is translated, the adapters set for each encoder component and each decoder component are used to correct the data output by the corresponding encoder or decoder. This improves the translation accuracy of the multilingual machine translation model while adding only a small number of parameters.
FIG. 6 is a structural block diagram of a translation apparatus based on a multilingual machine translation model provided by an embodiment of the present disclosure. The apparatus may be implemented in software and/or hardware and may be configured in an electronic device, for example a mobile phone, a tablet computer or a computer device, and may translate sentences by executing the translation method based on a multilingual machine translation model.
As shown in FIG. 6, the translation apparatus based on a multilingual machine translation model provided by this embodiment may include a sentence acquisition module 601, an adapter determination module 602 and a translation module 603. The sentence acquisition module 601 is configured to acquire an original sentence to be translated and translation language information of the original sentence. The adapter determination module 602 is configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct a translation error of a preset multilingual machine translation model. The translation module 603 is configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
The translation apparatus based on a multilingual machine translation model provided by this embodiment acquires, through the sentence acquisition module, the original sentence to be translated and the translation language information of the original sentence; determines, through the adapter determination module, the target adapter that corresponds to the translation language information of the original sentence and is used to correct the translation error of the preset multilingual machine translation model; and translates, through the translation module, the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence. By adopting the above technical solution and using an adapter to correct the translation error of the multilingual machine translation model, this embodiment can improve the accuracy of the translation result output by the model.
Optionally, the multilingual machine translation model includes an encoder and a decoder. The encoder includes at least one encoder sub-layer component, and each encoder sub-layer component is composed of at least one encoder sub-layer. The decoder includes at least one decoder sub-layer component, and each decoder sub-layer component is composed of at least one decoder sub-layer. Each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.
Optionally, the translation module 603 is configured to translate the original sentence using the multilingual machine translation model, and to correct the original output value of each sub-layer component using the first target adapter of that sub-layer component, so as to obtain the target sentence, where the sub-layer components include encoder sub-layer components and/or decoder sub-layer components.
Optionally, the translation module 603 includes: a component determination unit, configured to determine, according to the connection relationship of the multiple sub-layer components in the multilingual machine translation model, the first sub-layer component in the model as the current sub-layer component, and to acquire target input data of the current sub-layer component; an adapter acquisition unit, configured to determine the first target adapter of the current sub-layer component as the current target adapter; a parameter determination unit, configured to input the target input data into the current sub-layer component and the current target adapter, so as to obtain original output data of the current sub-layer component and a current correction parameter output by the current target adapter; a correction unit, configured to correct the original output data using the current correction parameter to obtain target output data of the current sub-layer component; a calling unit, configured to determine the target output data as the target input data of the next sub-layer component, determine the next sub-layer component as the current sub-layer component, and call the adapter acquisition unit again, until no next sub-layer component exists; and an input unit, configured to, when no next sub-layer component exists, input the target output data of the current sub-layer component into the layer that follows the current sub-layer component, so as to obtain the target sentence.
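The unit-by-unit procedure above amounts to a loop over the sub-layer components. The sketch below is illustrative only: `run_components` and its parameters are hypothetical names, vectors are plain Python lists, and the correction is assumed to be an element-wise addition of the adapter output to the original output, which this passage does not specify.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_components(components: Sequence[Layer],
                   adapter_tables: Sequence[Dict[str, Layer]],
                   lang_info: str,
                   first_input: Vector,
                   final_layer: Callable[[Vector], str]) -> str:
    """Pass data through every sub-layer component, correcting each output."""
    target_input = first_input
    for component, adapter_table in zip(components, adapter_tables):
        # Adapter acquisition: the component's adapter for this language info.
        current_adapter = adapter_table[lang_info]
        # Parameter determination: feed the same input to both.
        original_output = component(target_input)
        correction = current_adapter(target_input)
        # Correction (assumed additive): original output plus correction.
        target_output = [o + c for o, c in zip(original_output, correction)]
        # Calling: the corrected output becomes the next component's input.
        target_input = target_output
    # Input unit: no next component, so hand over to the following layer.
    return final_layer(target_input)
```

Both the component and its adapter receive the same target input, matching the parameter determination unit; only the way the two outputs are combined is an assumption here.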
In the above solution, the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer. The output of the input word embedding layer is connected to the input of the first encoder sub-layer component in the multilingual machine translation model, and the output of the output word embedding layer is connected to the input of the first decoder sub-layer component in the multilingual machine translation model.
Optionally, the translation apparatus based on a multilingual machine translation model provided in this embodiment further includes: an adapter input module, configured to, upon receiving original word embedding output data of a word embedding layer, input the original word embedding output data into a second target adapter of the word embedding layer and acquire a word embedding correction parameter output by the second target adapter, where the word embedding layer is the input word embedding layer or the output word embedding layer; and an embedding layer correction module, configured to correct the original word embedding output data using the word embedding correction parameter to obtain target word embedding output data of the word embedding layer, so that the target word embedding output data serves as the target input data of the sub-layer component connected to the word embedding layer.
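The embedding-layer correction can be sketched in the same spirit. The function name `correct_embedding` is hypothetical, and the correction is again assumed to be an element-wise addition; the passage above only states that the correction parameter is used to correct the original embedding output.

```python
from typing import Callable, Dict, List

Vector = List[float]


def correct_embedding(original_embedding: Vector,
                      adapters_by_lang: Dict[str, Callable[[Vector], Vector]],
                      lang_info: str) -> Vector:
    """Correct a word embedding layer's output with its language-specific adapter."""
    # Adapter input: the second target adapter receives the original embedding.
    second_target_adapter = adapters_by_lang[lang_info]
    correction = second_target_adapter(original_embedding)
    # Embedding correction (assumed additive). The result would then serve as
    # the target input data of the connected sub-layer component.
    return [e + c for e, c in zip(original_embedding, correction)]
```

Unlike the sub-layer case, the adapter here takes the embedding layer's output rather than its input.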
Optionally, the translation apparatus based on a multilingual machine translation model provided in this embodiment further includes: an adapter training module, configured to, before the original sentence to be translated and its translation language information are acquired, obtain, for each kind of translation language information, multiple training samples that match that translation language information, and input the training samples into the multilingual machine translation model, so as to train an adapter corresponding to that translation language information.
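The per-language training step can be illustrated with a deliberately tiny stand-in. Everything here is an assumption made for illustration: the base model is a fixed scalar function, each "adapter" is a single additive bias, and "training" is the closed-form least-squares fit of that bias. Real adapters would be trained by gradient descent inside the translation model; the sketch only shows that samples are grouped by translation language information and one adapter is fitted per group.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical sample layout: (language info, model input, reference output).
Sample = Tuple[str, float, float]


def train_adapters(samples: Iterable[Sample],
                   base_model: Callable[[float], float]) -> Dict[str, float]:
    """Fit one additive-bias 'adapter' per kind of translation language info."""
    by_lang: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for lang_info, x, y in samples:
        by_lang[lang_info].append((x, y))

    adapters: Dict[str, float] = {}
    for lang_info, pairs in by_lang.items():
        # The base model is left unchanged (assumption); the bias that
        # minimises squared error is the mean residual for this language.
        residuals = [y - base_model(x) for x, y in pairs]
        adapters[lang_info] = sum(residuals) / len(residuals)
    return adapters
```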
The translation apparatus based on a multilingual machine translation model provided by the embodiments of the present disclosure can execute the translation method based on a multilingual machine translation model provided by any embodiment of the present disclosure, and has the functional modules and effects corresponding to executing that method. For technical details not described in detail in this embodiment, refer to the translation method based on a multilingual machine translation model provided by any embodiment of the present disclosure.
Referring now to FIG. 7, a schematic structural diagram of an electronic device (for example, a terminal device) 700 suitable for implementing embodiments of the present disclosure is shown. Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), and in-vehicle terminals (for example, in-vehicle navigation terminals), as well as stationary terminals such as digital televisions (TVs) and desktop computers. The electronic device shown in FIG. 7 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 7, the electronic device 700 may include a processing apparatus (for example, a central processing unit or a graphics processor) 701, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage apparatus 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device 700. The processing apparatus 701, the ROM 702, and the RAM 703 are connected to one another through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Generally, the following apparatuses can be connected to the I/O interface 705: an input apparatus 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; an output apparatus 707 including, for example, a liquid crystal display (LCD), speaker, or vibrator; a storage apparatus 708 including, for example, a magnetic tape or hard disk; and a communication apparatus 709. The communication apparatus 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 7 shows an electronic device 700 having various apparatuses, it is not required to implement or provide all of the apparatuses shown. More or fewer apparatuses may alternatively be implemented or provided.
According to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication apparatus 709, installed from the storage apparatus 708, or installed from the ROM 702. When the computer program is executed by the processing apparatus 701, the above functions defined in the methods of the embodiments of the present disclosure are performed.
The computer-readable medium described above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to: electric wire, optical cable, radio frequency (RF), or any suitable combination of the above.
In some implementations, the client and the server may communicate using any currently known or future-developed network protocol, such as the HyperText Transfer Protocol (HTTP), and may be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), an internetwork (for example, the Internet), and a peer-to-peer network (for example, an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.
The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire an original sentence to be translated and translation language information of the original sentence; determine a target adapter corresponding to the translation language information of the original sentence, where the target adapter is used to correct a translation error of a preset multilingual machine translation model; and translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The name of a unit or module does not, under certain circumstances, constitute a limitation on the unit or module itself.
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Parts (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the above. Examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a RAM, a ROM, an EPROM or flash memory, an optical fiber, a CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above.
According to one or more embodiments of the present disclosure, Example 1 provides a translation method based on a multilingual machine translation model, including:
acquiring an original sentence to be translated and translation language information of the original sentence; determining a target adapter corresponding to the translation language information of the original sentence, where the target adapter is used to correct a translation error of a preset multilingual machine translation model; and translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
According to one or more embodiments of the present disclosure, Example 2 provides the method of Example 1, where the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, each composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, each composed of at least one decoder sub-layer; and each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.
According to one or more embodiments of the present disclosure, Example 3 provides the method of Example 2, where translating the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence includes:
translating the original sentence using the multilingual machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of that sub-layer component, so as to obtain the target sentence, where each sub-layer component is an encoder sub-layer component and/or a decoder sub-layer component.
According to one or more embodiments of the present disclosure, Example 4 provides the method of Example 3, where translating the original sentence using the multilingual machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of that sub-layer component to obtain the target sentence, includes:
determining, according to the connection relationship of the multiple sub-layer components in the multilingual machine translation model, the first sub-layer component in the model as the current sub-layer component, and acquiring target input data of the current sub-layer component; determining the first target adapter of the current sub-layer component as the current target adapter; inputting the target input data into the current sub-layer component and the current target adapter, so as to obtain original output data of the current sub-layer component and a current correction parameter output by the current target adapter; correcting the original output data using the current correction parameter to obtain target output data of the current sub-layer component; determining the target output data as the target input data of the next sub-layer component, determining the next sub-layer component as the current sub-layer component, and returning to the operation of determining the first target adapter of the current sub-layer component as the current target adapter, until no next sub-layer component exists; and, when no next sub-layer component exists, inputting the target output data of the current sub-layer component into the layer that follows the current sub-layer component, so as to obtain the target sentence.
According to one or more embodiments of the present disclosure, Example 5 provides the method of Example 3 or 4, where the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer; the output of the input word embedding layer is connected to the input of the first encoder sub-layer component in the multilingual machine translation model, and the output of the output word embedding layer is connected to the input of the first decoder sub-layer component in the multilingual machine translation model.
According to one or more embodiments of the present disclosure, Example 6 provides the method of Example 5, further including:
upon receiving original word embedding output data of a word embedding layer, inputting the original word embedding output data into a second target adapter of the word embedding layer, and acquiring a word embedding correction parameter output by the second target adapter, where the word embedding layer is the input word embedding layer or the output word embedding layer; and correcting the original word embedding output data using the word embedding correction parameter to obtain target word embedding output data of the word embedding layer, so that the target word embedding output data serves as the target input data of the sub-layer component connected to the word embedding layer.
According to one or more embodiments of the present disclosure, Example 7 provides the method of any one of Examples 1-4, further including, before acquiring the original sentence to be translated and the translation language information of the original sentence:
for each kind of translation language information, obtaining multiple training samples that match that translation language information, and inputting the training samples into the multilingual machine translation model, so as to train an adapter corresponding to that translation language information.
According to one or more embodiments of the present disclosure, Example 8 provides a translation apparatus based on a multilingual machine translation model, including:
a sentence acquisition module, configured to acquire an original sentence to be translated and translation language information of the original sentence; an adapter determination module, configured to determine a target adapter corresponding to the translation language information of the original sentence, where the target adapter is used to correct a translation error of a preset multilingual machine translation model; and a translation module, configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
According to one or more embodiments of the present disclosure, Example 9 provides an electronic device, including:
one or more processors; and a memory configured to store one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the translation method based on a multilingual machine translation model according to any one of Examples 1-7.
According to one or more embodiments of the present disclosure, Example 10 provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the translation method based on a multilingual machine translation model according to any one of Examples 1-7.
In addition, although multiple operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains multiple implementation details, these should not be construed as limitations on the scope of the present disclosure. Some features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Claims (10)

  1. A translation method based on a multilingual machine translation model, comprising:
    acquiring an original sentence to be translated and translation language information of the original sentence;
    determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct a translation error of a preset multilingual machine translation model; and
    translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
  2. The method according to claim 1, wherein the multilingual machine translation model comprises an encoder and a decoder; the encoder comprises at least one encoder sub-layer component, each composed of at least one encoder sub-layer; the decoder comprises at least one decoder sub-layer component, each composed of at least one decoder sub-layer; and each encoder sub-layer component and each decoder sub-layer component is provided with different adapters corresponding to different translation language information.
  3. The method according to claim 2, wherein translating the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence comprises:
    translating the original sentence using the multilingual machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of that sub-layer component, so as to obtain the target sentence, wherein each sub-layer component comprises at least one of an encoder sub-layer component and a decoder sub-layer component.
  4. The method according to claim 3, wherein translating the original sentence using the multilingual machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of that sub-layer component to obtain the target sentence, comprises:
    determining, according to the connection relationship of the multiple sub-layer components in the multilingual machine translation model, the first sub-layer component in the multilingual machine translation model as the current sub-layer component, and acquiring target input data of the current sub-layer component;
    determining the first target adapter of the current sub-layer component as the current target adapter;
    inputting the target input data into the current sub-layer component and the current target adapter, so as to obtain original output data of the current sub-layer component and a current correction parameter output by the current target adapter;
    correcting the original output data using the current correction parameter to obtain target output data of the current sub-layer component;
    determining the target output data as the target input data of the next sub-layer component, determining the next sub-layer component as the current sub-layer component, and returning to the operation of determining the first target adapter of the current sub-layer component as the current target adapter, until no next sub-layer component exists; and
    in a case where no next sub-layer component exists, inputting the target output data of the current sub-layer component into the layer that follows the current sub-layer component, so as to obtain the target sentence.
  5. The method according to claim 3 or 4, wherein the multilingual machine translation model further comprises an input word embedding layer and an output word embedding layer, an output of the input word embedding layer being connected to an input of the first encoder sub-layer component in the multilingual machine translation model, and an output of the output word embedding layer being connected to an input of the first decoder sub-layer component in the multilingual machine translation model.
  6. The method according to claim 5, further comprising:
    upon receiving raw word embedding output data of a word embedding layer, inputting the raw word embedding output data into a second target adapter of the word embedding layer, and acquiring a word embedding correction parameter output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer;
    correcting the raw word embedding output data using the word embedding correction parameter to obtain target word embedding output data of the word embedding layer, so that the target word embedding output data serves as the target input data of the sub-layer component connected to the word embedding layer.
  7. The method according to any one of claims 1-4, further comprising, before acquiring the original sentence to be translated and the translation language information of the original sentence:
    for each kind of translation language information, acquiring a plurality of training samples that match the translation language information, and inputting the plurality of training samples into the multilingual machine translation model, to train the adapter corresponding to that translation language information.
  8. A translation apparatus based on a multilingual machine translation model, comprising:
    a sentence acquisition module, configured to acquire an original sentence to be translated and translation language information of the original sentence;
    an adapter determination module, configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model;
    a translation module, configured to translate the original sentence based on the multilingual machine translation model and the target adapter, to obtain a target sentence.
  9. An electronic device, comprising:
    at least one processor;
    a memory, configured to store at least one program;
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the translation method based on a multilingual machine translation model according to any one of claims 1-7.
  10. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the translation method based on a multilingual machine translation model according to any one of claims 1-7.
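
The iteration recited in claim 4 and the word embedding correction recited in claim 6 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The names `SubLayerComponent`, `run_components` and `correct_embedding` are hypothetical, vectors are plain lists of floats, and the correction is assumed to be an element-wise addition, which the claims do not specify. The final layer that turns the last target output data into the target sentence is omitted.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def apply_correction(raw: Vector, correction: Vector) -> Vector:
    # Assumption: the correction parameter is added element-wise to the raw
    # output. The claims only say the raw output is "corrected".
    return [r + c for r, c in zip(raw, correction)]


class SubLayerComponent:
    """Hypothetical encoder or decoder sub-layer component with adapters."""

    def __init__(self, layer: Layer, adapters: Dict[str, Layer]) -> None:
        self.layer = layer
        # One adapter per kind of translation language information (claim 2).
        self.adapters = adapters

    def forward(self, target_input: Vector, lang: str) -> Vector:
        # The same target input data goes to the component and to its
        # first target adapter (claim 4).
        raw_output = self.layer(target_input)
        correction = self.adapters[lang](target_input)
        return apply_correction(raw_output, correction)


def run_components(
    components: Sequence[SubLayerComponent], first_input: Vector, lang: str
) -> Vector:
    """Walk the components in connection order, as in claim 4."""
    data = first_input
    for component in components:
        # The target output of one component is the target input of the next.
        data = component.forward(data, lang)
    return data


def correct_embedding(raw_embedding: Vector, adapter: Layer) -> Vector:
    """Claim 6: the second target adapter receives the raw embedding output."""
    return apply_correction(raw_embedding, adapter(raw_embedding))


if __name__ == "__main__":
    # Toy stand-ins for a trained sub-layer and a trained adapter.
    double: Layer = lambda v: [2.0 * e for e in v]
    plus_one: Layer = lambda v: [1.0 for _ in v]
    stack = [SubLayerComponent(double, {"en-zh": plus_one}) for _ in range(2)]
    embedded = correct_embedding([1.0, 2.0], lambda v: [0.0 for _ in v])
    print(run_components(stack, embedded, "en-zh"))
```

Note the difference between the two cases in the claims as written: a sub-layer adapter receives the component's input, while the word embedding adapter receives the embedding layer's raw output.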
PCT/CN2021/131090 2020-12-04 2021-11-17 Translation method and apparatus employing multi-language machine translation model, device, and medium WO2022116821A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011409340.1A CN112380876A (en) 2020-12-04 2020-12-04 Translation method, device, equipment and medium based on multi-language machine translation model
CN202011409340.1 2020-12-04

Publications (1)

Publication Number Publication Date
WO2022116821A1 true WO2022116821A1 (en) 2022-06-09

Family

ID=74590507

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131090 WO2022116821A1 (en) 2020-12-04 2021-11-17 Translation method and apparatus employing multi-language machine translation model, device, and medium

Country Status (2)

Country Link
CN (1) CN112380876A (en)
WO (1) WO2022116821A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115438678A (en) * 2022-11-08 2022-12-06 苏州浪潮智能科技有限公司 Machine translation method, device, electronic equipment and storage medium
CN115688815A (en) * 2022-12-30 2023-02-03 北京澜舟科技有限公司 Multilingual translation model construction method and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380876A (en) * 2020-12-04 2021-02-19 北京有竹居网络技术有限公司 Translation method, device, equipment and medium based on multi-language machine translation model

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543824A (en) * 2018-11-30 2019-03-29 腾讯科技(深圳)有限公司 A kind for the treatment of method and apparatus of series model
US20200034436A1 (en) * 2018-07-26 2020-01-30 Google Llc Machine translation using neural network models
CN110852116A (en) * 2019-11-07 2020-02-28 腾讯科技(深圳)有限公司 Non-autoregressive neural machine translation method, device, computer equipment and medium
CN111178093A (en) * 2019-12-20 2020-05-19 沈阳雅译网络技术有限公司 Neural machine translation system training acceleration method based on stacking algorithm
CN111222347A (en) * 2020-04-15 2020-06-02 北京金山数字娱乐科技有限公司 Sentence translation model training method and device and sentence translation method and device
CN111814493A (en) * 2020-04-21 2020-10-23 北京嘀嘀无限科技发展有限公司 Machine translation method, device, electronic equipment and storage medium
CN111859927A (en) * 2020-06-01 2020-10-30 北京先声智能科技有限公司 Grammar error correction model based on attention sharing Transformer
CN112380876A (en) * 2020-12-04 2021-02-19 北京有竹居网络技术有限公司 Translation method, device, equipment and medium based on multi-language machine translation model

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7039625B2 (en) * 2002-11-22 2006-05-02 International Business Machines Corporation International information search and delivery system providing search results personalized to a particular natural language
US8880770B2 (en) * 2012-06-07 2014-11-04 Apple Inc. Protocol translating adapter
US10909331B2 (en) * 2018-03-30 2021-02-02 Apple Inc. Implicit identification of translation payload with neural machine translation
CN111460838B (en) * 2020-04-23 2023-09-22 腾讯科技(深圳)有限公司 Pre-training method, device and storage medium of intelligent translation model

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200034436A1 (en) * 2018-07-26 2020-01-30 Google Llc Machine translation using neural network models
CN109543824A (en) * 2018-11-30 2019-03-29 腾讯科技(深圳)有限公司 A kind for the treatment of method and apparatus of series model
CN110852116A (en) * 2019-11-07 2020-02-28 腾讯科技(深圳)有限公司 Non-autoregressive neural machine translation method, device, computer equipment and medium
CN111178093A (en) * 2019-12-20 2020-05-19 沈阳雅译网络技术有限公司 Neural machine translation system training acceleration method based on stacking algorithm
CN111222347A (en) * 2020-04-15 2020-06-02 北京金山数字娱乐科技有限公司 Sentence translation model training method and device and sentence translation method and device
CN111814493A (en) * 2020-04-21 2020-10-23 北京嘀嘀无限科技发展有限公司 Machine translation method, device, electronic equipment and storage medium
CN111859927A (en) * 2020-06-01 2020-10-30 北京先声智能科技有限公司 Grammar error correction model based on attention sharing Transformer
CN112380876A (en) * 2020-12-04 2021-02-19 北京有竹居网络技术有限公司 Translation method, device, equipment and medium based on multi-language machine translation model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NEIL HOULSBY; ANDREI GIURGIU; STANISLAW JASTRZEBSKI; BRUNA MORRONE; QUENTIN DE LAROUSSILHE; ANDREA GESMUNDO; MONA ATTARIYAN; SYLVA: "Parameter-Efficient Transfer Learning for NLP", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 2 February 2019 (2019-02-02), 201 Olin Library Cornell University Ithaca, NY 14853 , XP081024612 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115438678A (en) * 2022-11-08 2022-12-06 苏州浪潮智能科技有限公司 Machine translation method, device, electronic equipment and storage medium
CN115688815A (en) * 2022-12-30 2023-02-03 北京澜舟科技有限公司 Multilingual translation model construction method and storage medium
CN115688815B (en) * 2022-12-30 2023-03-31 北京澜舟科技有限公司 Multilingual translation model construction method and storage medium

Also Published As

Publication number Publication date
CN112380876A (en) 2021-02-19

Similar Documents

Publication Publication Date Title
WO2022116821A1 (en) Translation method and apparatus employing multi-language machine translation model, device, and medium
WO2022042512A1 (en) Text processing method and apparatus, electronic device, and medium
CN111046677B (en) Method, device, equipment and storage medium for obtaining translation model
WO2022116841A1 (en) Text translation method, apparatus and device, and storage medium
WO2022228041A1 (en) Translation model training method, apparatus, and device, and storage medium
CN111008533B (en) Method, device, equipment and storage medium for obtaining translation model
WO2022127620A1 (en) Voice wake-up method and apparatus, electronic device, and storage medium
US20110125486A1 (en) Self-configuring language translation device
US11270690B2 (en) Method and apparatus for waking up device
WO2022228221A1 (en) Information translation method, apparatus and device, and storage medium
CN111382261B (en) Abstract generation method and device, electronic equipment and storage medium
CN111597825B (en) Voice translation method and device, readable medium and electronic equipment
CN113378586B (en) Speech translation method, translation model training method, device, medium, and apparatus
CN112883967B (en) Image character recognition method, device, medium and electronic equipment
CN111368560A (en) Text translation method and device, electronic equipment and storage medium
WO2023103897A1 (en) Image processing method, apparatus and device, and storage medium
WO2023082931A1 (en) Method for punctuation recovery in speech recognition, and device and storage medium
WO2022116819A1 (en) Model training method and apparatus, machine translation method and apparatus, and device and storage medium
CN112309384B (en) Voice recognition method, device, electronic equipment and medium
CN111104796A (en) Method and device for translation
CN112257459B (en) Language translation model training method, translation method, device and electronic equipment
WO2023138361A1 (en) Image processing method and apparatus, and readable storage medium and electronic device
WO2022121859A1 (en) Spoken language information processing method and apparatus, and electronic device
CN115967833A (en) Video generation method, device and equipment meter storage medium
CN111221424B (en) Method, apparatus, electronic device, and computer-readable medium for generating information

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21899855

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21899855

Country of ref document: EP

Kind code of ref document: A1