WO2022116821A1 - Translation method and apparatus using a multilingual machine translation model, device and medium - Google Patents

Translation method and apparatus using a multilingual machine translation model, device and medium

Info

Publication number
WO2022116821A1
WO2022116821A1 (PCT/CN2021/131090)
Authority
WO
WIPO (PCT)
Prior art keywords
sub
target
layer
machine translation
word embedding
Prior art date
Application number
PCT/CN2021/131090
Other languages
English (en)
Chinese (zh)
Inventor
赵程绮
朱耀明
王明轩
封江涛
李磊
Original Assignee
北京有竹居网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京有竹居网络技术有限公司
Publication of WO2022116821A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Definitions

  • The present disclosure relates to the field of computer technology, for example, to a translation method, apparatus, device and medium based on a multilingual machine translation model.
  • Machine translation is one of the core tasks in natural language processing; it aims to use computer programs to translate one natural language into another.
  • Traditional machine translation models are generally bilingual machine translation models, which can handle translation in one language direction, such as translating English into Chinese.
  • To achieve pairwise translation among multiple natural languages, a large number of bilingual machine translation models would need to be trained; as a result, the multilingual machine translation model has gradually replaced the bilingual machine translation model and become one of the commonly used machine translation models.
  • However, the performance of a multilingual machine translation model is often inferior to that of a bilingual machine translation model, resulting in large translation errors in the translation results it outputs.
  • The present disclosure provides a translation method, apparatus, device and medium based on a multilingual machine translation model, so as to improve the accuracy of the translation result output by the multilingual machine translation model.
  • The present disclosure provides a translation method based on a multilingual machine translation model, including: acquiring the original sentence to be translated and the translation language information of the original sentence; determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
  • the present disclosure also provides a translation device based on a multilingual machine translation model, including:
  • a sentence acquisition module, configured to acquire the original sentence to be translated and the translation language information of the original sentence;
  • an adapter determination module configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model
  • a translation module configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
  • the present disclosure also provides an electronic device, comprising:
  • one or more processors;
  • a memory arranged to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the above-mentioned translation method based on a multilingual machine translation model.
  • The present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the above-mentioned translation method based on a multilingual machine translation model.
  • FIG. 1 is a schematic flowchart of a translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic structural diagram of an adapter according to an embodiment of the present disclosure
  • FIG. 3 is a schematic structural diagram of a multilingual machine translation model according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a connection relationship of a target adapter according to an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of another translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure
  • FIG. 6 is a structural block diagram of a translation device based on a multilingual machine translation model provided by an embodiment of the present disclosure
  • FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • The steps recited in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this regard.
  • The term "including" and variations thereof are open-ended inclusions, i.e., "including but not limited to".
  • The term "based on" means "based at least in part on".
  • The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.
  • FIG. 1 is a schematic flowchart of a translation method based on a multilingual machine translation model according to an embodiment of the present disclosure.
  • The method may be performed by a translation apparatus based on a multilingual machine translation model, where the apparatus may be implemented by software and/or hardware and may be configured in an electronic device; for example, the apparatus may be configured in a mobile phone, a tablet computer or a computer device.
  • As shown in FIG. 1, the translation method based on the multilingual machine translation model provided by this embodiment may include: S101, acquiring the original sentence to be translated and the translation language information of the original sentence; S102, determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and S103, translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
  • The original sentence is the sentence that needs to be translated this time. It can be input by the user through an input device such as a keyboard, or obtained through text recognition or speech recognition. That is, when the user needs to translate a sentence, the user can input the sentence into the electronic device through text or voice, or can take a picture containing the sentence or obtain a document containing the sentence and import the picture or document into the electronic device.
  • the translation method based on the multilingual machine translation model provided in this embodiment can translate the text or voice input by the user, and can also translate the sentences contained in the pictures or text imported by the user.
  • When translating a sentence contained in a voice input or a picture, the sentence can first be converted into an original sentence in text form before translation.
  • The translation language information of the original sentence can be understood as the translation direction information for this translation, and may include the source language information of this translation (that is, the language to which the original sentence to be translated belongs) and the target language information (that is, the language to which the target sentence obtained by translating the original sentence belongs). The language may be, for example, English, Chinese, or German.
  • When a user needs to translate an original sentence, the user inputs into the electronic device the original sentence, the language information of the source language to which the original sentence belongs, and the language information of the target language into which the original sentence needs to be translated, so as to generate a translation instruction for the original sentence. Correspondingly, upon receiving the translation instruction for the original sentence, the electronic device obtains the original sentence and determines the translation language information of the original sentence; for example, the language information of the source language selected by the user on the translation page is determined as the source language information, and the language information of the target language selected by the user on the translation page is determined as the target language information.
  • Adapters can be set for the multilingual machine translation model for different translation scenarios. When the multilingual machine translation model is used in a translation scenario, the adapter corresponding to that translation scenario is used to correct the translation errors of the multilingual machine translation model, thereby improving the accuracy of the translation result it outputs.
  • The parameter count of an adapter is very small (less than one-twentieth of that of the multilingual machine translation model, and the larger the multilingual machine translation model, the smaller this ratio becomes). Therefore, the cost of correcting the translation errors of the multilingual machine translation model by configuring adapters is very small, and the scheme is easy to deploy.
  • After acquiring the translation language information of the original sentence, the electronic device can acquire, from the multiple preset adapters and according to the translation language information, the adapter corresponding to the translation language information as the target adapter, as sketched below.
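  • Purely as an illustration (the registry, the names and the values below are assumptions, not the patent's API), a minimal Python sketch of this lookup, keyed by the translation direction, with one adapter per direction for simplicity:

      # Illustrative only: select the target adapter from preset adapters by the
      # translation language information, here a (source, target) language pair.
      preset_adapters = {
          ("en", "zh"): "adapter_en_zh",   # stands in for a trained adapter object
          ("zh", "en"): "adapter_zh_en",
          ("en", "de"): "adapter_en_de",
      }

      def select_target_adapter(source_lang: str, target_lang: str):
          """Return the adapter matching the translation language information."""
          return preset_adapters[(source_lang, target_lang)]

      print(select_target_adapter("en", "zh"))  # -> adapter_en_zh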
  • The number of adapters corresponding to one piece of translation language information may be one or more. That is, in this embodiment, a single adapter may be set for each piece of translation language information, and this adapter is used to correct the translation errors of the multilingual machine translation model; alternatively, multiple adapters may be set for each piece of translation language information, and these adapters are jointly used to correct the translation errors of the multilingual machine translation model, thereby improving the accuracy of the translation result output by the multilingual machine translation model. The latter situation is described below by way of example.
  • For different translation language information, the adapters used can be different, and the structure of the adapter can be selected flexibly. For example, an adapter may include a first feedforward layer and a second feedforward layer, and an activation function may be configured between the first feedforward layer and the second feedforward layer; the activation function may be a Gaussian Error Linear Unit (GELU), as shown in Figure 2. A sketch of such an adapter follows.
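  • As an illustration only (PyTorch is our choice here, and the hidden size d_model and bottleneck size d_adapter are assumptions not specified by the disclosure), a minimal sketch of such an adapter:

      import torch
      from torch import nn

      class Adapter(nn.Module):
          """Adapter as in Figure 2: a first feedforward layer, a GELU
          activation, then a second feedforward layer."""
          def __init__(self, d_model: int, d_adapter: int):
              super().__init__()
              self.first_ffn = nn.Linear(d_model, d_adapter)
              self.activation = nn.GELU()
              self.second_ffn = nn.Linear(d_adapter, d_model)

          def forward(self, x: torch.Tensor) -> torch.Tensor:
              # Computes the correction output from the sub-layer's input data.
              return self.second_ffn(self.activation(self.first_ffn(x)))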
  • The type of the multilingual machine translation model can be set as required; for example, the multilingual machine translation model can be a Transformer model, as shown in FIG. 3 (FIG. 3 shows only one encoder and one decoder), and the multilingual machine translation model may include at least one encoder and at least one decoder.
  • For example, the multilingual machine translation model includes multiple encoders and multiple decoders, such as 6 encoders and 6 decoders. Each encoder is provided with at least two encoder sub-layers: a self-attention layer and a feedforward layer. Each decoder is provided with at least three decoder sub-layers: a self-attention layer, an encoder-decoder attention layer and a feedforward layer. The multiple encoders are connected in series, the multiple decoders are connected in series, and the feedforward layer of the last encoder is connected with the encoder-decoder attention layer of each decoder.
  • Corresponding adapters may be set for each encoder and each decoder in the multilingual machine translation model; alternatively, corresponding adapters may be set for each encoder sub-layer and each decoder sub-layer, which is not limited in this embodiment.
  • Alternatively, the at least one encoder sub-layer in each encoder may be divided into one or more encoder sub-layer components, the at least one decoder sub-layer in each decoder may be divided into one or more decoder sub-layer components, and corresponding adapters may be set for each encoder sub-layer component and each decoder sub-layer component respectively.
  • In one embodiment, the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and the encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and the decoder sub-layer component is composed of at least one decoder sub-layer; and each encoder sub-layer component and each decoder sub-layer component are provided with different adapters corresponding to different translation language information.
  • The parameter values of the adapters corresponding to different sub-layers can be different. The self-attention layer and the feedforward layer in each encoder can each form an encoder sub-layer component on its own; the self-attention layer and the encoder-decoder attention layer in each decoder can together form one decoder sub-layer component, and the feedforward layer in each decoder can form another decoder sub-layer component on its own, as shown in Figure 4 (Figure 4 exemplarily shows only one encoder and one decoder). A sketch of this grouping follows.
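  • Reusing the Adapter sketch above, and purely as an illustration (the component names and sizes are assumptions), one adapter can be kept per translation direction for each sub-layer component grouped as in Figure 4:

      # One adapter per (translation direction, sub-layer component).
      directions = [("en", "zh"), ("zh", "en")]
      components = [
          "enc.self_attn",             # each encoder sub-layer is its own component
          "enc.ffn",
          "dec.self_attn+cross_attn",  # grouped into one decoder sub-layer component
          "dec.ffn",                   # the decoder feedforward layer stands alone
      ]
      adapters = {
          (d, c): Adapter(d_model=512, d_adapter=64)  # sizes are assumptions
          for d in directions
          for c in components
      }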
  • In one embodiment, before the acquiring of the original sentence to be translated and the translation language information of the original sentence, the method further includes: for each piece of translation language information, acquiring a plurality of training samples consistent with the translation language information, and inputting the plurality of training samples into the multilingual machine translation model, so as to obtain the adapter corresponding to the translation language information through training.
  • In this embodiment, the adapters corresponding to each sub-layer component (including each encoder sub-layer component and each decoder sub-layer component) under different translation language information can be obtained through training. For example, the parameters of the multilingual machine translation model can be fixed, an original adapter can be set for each sub-layer component of the multilingual machine translation model, training samples consistent with the translation language information can be used to train each original adapter, and test samples can be used to test the translation error of the multilingual machine translation model; when the translation error of the multilingual machine translation model is less than a preset error threshold, the adapter corresponding to each sub-layer component is determined as that component's adapter under the translation language information. A training-loop sketch is given below.
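  • A hedged sketch of this training regime, reusing the Adapter above: the base model's parameters are frozen so that only the adapter receives gradients. The placeholder base component, data and loss are assumptions standing in for the frozen multilingual model and its real translation objective:

      # Stand-ins so the sketch runs end to end; a real setup would use the
      # pretrained multilingual model and parallel data for one direction.
      base = nn.Linear(512, 512)                    # placeholder frozen base component
      adapter = Adapter(d_model=512, d_adapter=64)  # original adapter being trained
      data = [(torch.randn(8, 512), torch.randn(8, 512))]

      for p in base.parameters():
          p.requires_grad = False                   # fix the base model's parameters

      optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-4)  # lr is an assumption

      for src, tgt in data:                         # samples for one translation direction
          out = base(src) + adapter(src)            # adapter-corrected output
          loss = nn.functional.mse_loss(out, tgt)   # placeholder training objective
          optimizer.zero_grad()
          loss.backward()                           # gradients reach only the adapter
          optimizer.step()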
  • The target sentence is the sentence obtained by translating the original sentence. When translating the original sentence, the data output by each layer of the multilingual machine translation model can be obtained, and the target adapter can be used to correct that data to obtain the target sentence. For example, the adapter corresponding to an encoder or a decoder corrects the data output by that encoder or decoder, the corrected data is input into the next layer, and the sentence finally output by the multilingual machine translation model is determined as the target sentence; or, the adapter corresponding to a sub-layer component corrects the original data output by that component, the corrected target data is input into the next layer, and the sentence finally output by the multilingual machine translation model is determined as the target sentence, thereby completing the translation of the original sentence.
  • The translation method based on the multilingual machine translation model provided by this embodiment acquires the original sentence to be translated and the translation language information of the original sentence, determines the target adapter that corresponds to the translation language information of the original sentence and is used to correct the translation errors of the preset multilingual machine translation model, and translates the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence. In this way, the accuracy of the translation result output by the multilingual machine translation model can be improved.
  • FIG. 5 is a schematic flowchart of another translation method based on a multilingual machine translation model provided by an embodiment of the present disclosure.
  • the solution in this embodiment may be combined with one or more optional solutions in the foregoing embodiments.
  • In this embodiment, the multilingual machine translation model is used to translate the original sentence, and the first target adapter of each sub-layer component is used to correct the original output of that sub-layer component to obtain the target sentence.
  • As shown in FIG. 5, the translation method based on the multilingual machine translation model provided by this embodiment may include the following steps.
  • The target input data can be understood as the data input into the current sub-layer component. When translating the original sentence, the original sentence can be input into the multilingual machine translation model, and, according to the connection relationships among the multiple layers of the multilingual machine translation model, each layer is controlled in turn to process the information output by its previous layer. When the target output data of the layer preceding the first sub-layer component in the multilingual machine translation model is obtained, the first sub-layer component is determined as the current sub-layer component, and the target output data of its preceding layer is determined as the target input data of the current sub-layer component.
  • The first target adapter can be understood as an adapter configured for a sub-layer component and used for correcting the original output data of that sub-layer component, that is, an adapter configured for an encoder sub-layer component of an encoder or a decoder sub-layer component of a decoder. The target adapters corresponding to the translation language information of the original sentence may include the first target adapter configured for each sub-layer component in the multilingual machine translation model, and may also include the second target adapter configured for each word embedding layer (including the input word embedding layer and the output word embedding layer) in the multilingual machine translation model.
  • According to the component identification information of the current sub-layer component, the first target adapter corresponding to that component identification information may be selected, from the plurality of target adapters determined in S202, as the current target adapter.
  • S205: Input the target input data into the current sub-layer component and the current target adapter, so as to obtain the original output data of the current sub-layer component and the current correction parameters output by the current target adapter.
  • The original output data of the current sub-layer component can be understood as the uncorrected data obtained by the current sub-layer component operating on its target input data. The current correction parameters can be understood as parameters for correcting the original output data of the current sub-layer component, and can be calculated by the current target adapter configured for the current sub-layer component from the target input data of the current sub-layer component. That is, the target input data of the current sub-layer component can be input into both the current sub-layer component and the current target adapter; the data output by the current sub-layer component is obtained as the original output data of the current sub-layer component, and the data output by the current target adapter is obtained as the current correction parameters.
  • The current correction parameters can be used to correct the original output data; for example, the original output data is corrected to the sum of the original output data and the current correction parameters, and the data obtained by correcting the original output data is determined as the target output data of the current sub-layer component, as in the sketch below.
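  • As an illustrative sketch only (the names are assumptions), this correction amounts to a residual-style sum of the component's output and the adapter's output, both computed from the same target input data:

      def corrected_forward(component, adapter, target_input):
          """Run one sub-layer component and correct its output with its adapter."""
          original_output = component(target_input)  # uncorrected sub-layer output
          correction = adapter(target_input)         # current correction parameters
          return original_output + correction        # target output data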
  • According to the connection relationships of the multiple sub-layer components in the multilingual machine translation model, it can be judged whether the next layer connected to the output end of the current sub-layer component is located in a sub-layer component. If the next layer connected to the output end of the current sub-layer component is located in a sub-layer component, it is determined that a next-level component exists, and the sub-layer component to which that next layer belongs is determined as the next-level component. When multiple sub-layers are connected to the output end of the current sub-layer component, the next-level component can be determined according to the data flow in the multilingual machine translation model: the sub-layer component through which the data flows first is determined as the next-level component.
  • S208: Determine the target output data as the target input data of the next-level component, determine the next-level component as the current sub-layer component, and return to executing S204.
  • S209: When there is no next-level component, input the target output data of the current sub-layer component into the next layer of the current sub-layer component to obtain the target sentence.
  • For example, the electronic device can input the target output data into the next layer of the current sub-layer component in the multilingual machine translation model, and the multilingual machine translation model can then output, through its output layer, the target sentence obtained by translating the original sentence.
  • In one embodiment, the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer is connected to the input end of the first encoder sub-layer component in the multilingual machine translation model, and the output end of the output word embedding layer is connected to the input end of the first decoder sub-layer component in the multilingual machine translation model.
  • That is, the multilingual machine translation model may also be provided with an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer may be connected to the input end of the self-attention layer of the first encoder in the multilingual machine translation model, and the output end of the output word embedding layer may be connected to the input end of the self-attention layer of the first decoder in the multilingual machine translation model.
  • Setting corresponding adapters for the input word embedding layer and the output word embedding layer of the multilingual machine translation model can help the model better model word semantics.
  • In one embodiment, the translation method based on the multilingual machine translation model further includes: when receiving the original word embedding output data of a word embedding layer, inputting the original word embedding output data into the second target adapter of the word embedding layer, and obtaining the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; and correcting the original word embedding output data by using the word embedding correction parameters to obtain the target word embedding output data of the word embedding layer, so as to use the target word embedding output data as the target input data of the sub-layer component connected to the word embedding layer.
  • The second target adapter can be understood as an adapter configured for the input word embedding layer or the output word embedding layer and used to correct the original word embedding output data of that layer. The original word embedding output data can be understood as the data output by the word embedding layer.
  • For the input word embedding layer, the electronic device may first obtain the second target adapter corresponding to the identification information of the input word embedding layer, input the first original word embedding output data into that second target adapter, and obtain the data output by the second target adapter as the word embedding correction parameters; the first original word embedding output data is then corrected using the word embedding correction parameters, the corrected first original word embedding output data is used as the target word embedding output data of the input word embedding layer, and the target word embedding output data is input into the self-attention layer connected to the output end of the input word embedding layer.
  • Similarly, for the output word embedding layer, the electronic device may first obtain the second target adapter corresponding to the identification information of the output word embedding layer, input the second original word embedding output data into that second target adapter, and obtain the data output by the second target adapter as the word embedding correction parameters; the second original word embedding output data is then corrected using the word embedding correction parameters, the corrected second original word embedding output data is used as the target word embedding output data of the output word embedding layer, and the target word embedding output data is input into the self-attention layer connected to the output end of the output word embedding layer. A sketch of this embedding-side correction follows.
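  • A hedged sketch of the embedding-side correction (the names are illustrative, and the additive form mirrors the sum used above for sub-layer components; note that here the adapter reads the embedding layer's output rather than the token ids, as described above):

      def corrected_embedding(word_embedding_layer, second_target_adapter, token_ids):
          """Correct a word embedding layer's output with its second target adapter."""
          original = word_embedding_layer(token_ids)    # original word embedding output data
          correction = second_target_adapter(original)  # word embedding correction parameters
          return original + correction                  # target word embedding output data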
  • In the translation method based on the multilingual machine translation model provided by this embodiment, adapters corresponding to different translation language information are set for each encoder component and each decoder component in the multilingual machine translation model; when translating the original sentence, the adapters set for each encoder component and each decoder component are used to correct the data output by the corresponding encoder or decoder, which can improve the translation accuracy of the multilingual machine translation model while adding only a small number of parameters.
  • FIG. 6 is a structural block diagram of a translation apparatus based on a multilingual machine translation model according to an embodiment of the present disclosure.
  • the apparatus can be implemented by software and/or hardware, and can be configured in electronic equipment.
  • the apparatus can be configured in mobile phones, tablet computers, or computer equipment, and can perform sentence translation by executing a translation method based on a multilingual machine translation model.
  • As shown in FIG. 6, the translation apparatus based on a multilingual machine translation model provided by this embodiment may include a sentence acquisition module 601, an adapter determination module 602, and a translation module 603. The sentence acquisition module 601 is configured to acquire the original sentence to be translated and the translation language information of the original sentence; the adapter determination module 602 is configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and the translation module 603 is configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence.
  • In the translation apparatus based on the multilingual machine translation model provided by this embodiment, the original sentence to be translated and the translation language information of the original sentence are obtained through the sentence acquisition module; the target adapter corresponding to the translation language information of the original sentence, which is used to correct the translation errors of the preset multilingual machine translation model, is determined through the adapter determination module; and the original sentence is translated by the translation module based on the multilingual machine translation model and the target adapter to obtain the target sentence. In this way, the accuracy of the translation result output by the multilingual machine translation model can be improved.
  • In one embodiment, the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and the encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and the decoder sub-layer component is composed of at least one decoder sub-layer; and each encoder sub-layer component and each decoder sub-layer component are provided with different adapters corresponding to different translation language information.
  • In one embodiment, the translation module 603 is configured to: use the multilingual machine translation model to translate the original sentence, and use the first target adapter of each sub-layer component to correct the original output value of that sub-layer component to obtain the target sentence, wherein the sub-layer components include encoder sub-layer components and/or decoder sub-layer components.
  • In one embodiment, the translation module 603 includes: a component determination unit, configured to determine, according to the connection relationships of the multiple sub-layer components in the multilingual machine translation model, the first sub-layer component in the multilingual machine translation model as the current sub-layer component, and to acquire the target input data of the current sub-layer component; an adapter acquisition unit, configured to determine the first target adapter of the current sub-layer component as the current target adapter; a parameter determination unit, configured to input the target input data into the current sub-layer component and the current target adapter, so as to obtain the original output data of the current sub-layer component and the current correction parameters output by the current target adapter; a correction unit, configured to correct the original output data by using the current correction parameters to obtain the target output data of the current sub-layer component; a calling unit, configured to determine the target output data as the target input data of the next-level component, determine the next-level component as the current sub-layer component, and return to calling the adapter acquisition unit until there is no next-level component; and an input unit, configured to, when there is no next-level component, input the target output data of the current sub-layer component into the next layer of the current sub-layer component to obtain the target sentence.
  • In one embodiment, the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer is connected to the input end of the first encoder sub-layer component in the multilingual machine translation model, and the output end of the output word embedding layer is connected to the input end of the first decoder sub-layer component in the multilingual machine translation model.
  • In one embodiment, the translation apparatus based on the multilingual machine translation model further includes: an adapter input module, configured to, when receiving the original word embedding output data of a word embedding layer, input the original word embedding output data into the second target adapter of the word embedding layer and obtain the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; and an embedding layer correction module, configured to correct the original word embedding output data by using the word embedding correction parameters to obtain the target word embedding output data of the word embedding layer, so as to use the target word embedding output data as the target input data of the sub-layer component connected to the word embedding layer.
  • In one embodiment, the translation apparatus based on the multilingual machine translation model further includes: an adapter training module, configured to, before the original sentence to be translated and the translation language information of the original sentence are acquired, acquire, for each piece of translation language information, a plurality of training samples consistent with the translation language information, and input the plurality of training samples into the multilingual machine translation model, so as to obtain the adapter corresponding to the translation language information through training.
  • The translation apparatus based on the multilingual machine translation model provided by the embodiments of the present disclosure can execute the translation method based on the multilingual machine translation model provided by any embodiment of the present disclosure, and has the corresponding functional modules and effects for executing the method.
  • Referring to FIG. 7, it shows a schematic structural diagram of an electronic device (e.g., a terminal device) 700 suitable for implementing an embodiment of the present disclosure.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs) and in-vehicle terminals (e.g., in-vehicle navigation terminals), and stationary terminals such as digital televisions (TVs) and desktop computers.
  • The electronic device shown in FIG. 7 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
  • The electronic device 700 may include a processing device (e.g., a central processing unit, a graphics processor) 701, which may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703.
  • In the RAM 703, various programs and data required for the operation of the electronic device 700 are also stored.
  • the processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704.
  • An Input/Output (I/O) interface 705 is also connected to the bus 704 .
  • Generally, the following devices can be connected to the I/O interface 705: an input device 706 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer and a gyroscope; an output device 707 including, for example, a liquid crystal display (LCD), a speaker and a vibrator; a storage device 708 including, for example, a magnetic tape and a hard disk; and a communication device 709.
  • The communication device 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data.
  • Although FIG. 7 shows an electronic device 700 having various devices, it is not required to implement or have all of the illustrated devices; more or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • The computer program may be downloaded and installed from a network via the communication device 709, or installed from the storage device 708 or the ROM 702. When the computer program is executed by the processing device 701, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium described above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • Examples of computer-readable storage media may include, but are not limited to, an electrical connection with one or more wires, a portable computer disk, a hard disk, RAM, ROM, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device .
  • the program code embodied on the computer readable medium can be transmitted by any suitable medium, including but not limited to: electric wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the above.
  • Clients and servers can communicate using any currently known or future-developed network protocol, such as the HyperText Transfer Protocol (HTTP), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (e.g., the Internet) and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • The above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to: obtain the original sentence to be translated and the translation language information of the original sentence; determine the target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and translate the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence.
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • The remote computer may be connected to the user's computer through any kind of network, including a LAN or a WAN, or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • Each block in the flowchart or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or can be implemented by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the name of the module does not constitute a limitation on the unit itself.
  • Exemplary types of hardware logic components that may be used include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. Examples of machine-readable storage media would include electrical connections based on one or more wires, portable computer disks, hard disks, RAM, ROM, EPROM or flash memory, optical fibers, CD-ROMs, optical storage devices, magnetic storage devices, or any suitable combination of the above.
  • Example 1 provides a translation method based on a multilingual machine translation model, including: acquiring an original sentence to be translated and translation language information of the original sentence; determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
  • Example 2: According to the method of Example 1, the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sub-layer component, and the encoder sub-layer component is composed of at least one encoder sub-layer; the decoder includes at least one decoder sub-layer component, and the decoder sub-layer component is composed of at least one decoder sub-layer; and each encoder sub-layer component and each decoder sub-layer component are provided with different adapters corresponding to different translation language information.
  • Example 3: According to the method of Example 2, the translating of the original sentence based on the multilingual machine translation model and the target adapter to obtain the target sentence includes: using the multilingual machine translation model to translate the original sentence, and using the first target adapter of each sub-layer component to correct the original output value of that sub-layer component to obtain the target sentence, wherein the sub-layer components include encoder sub-layer components and/or decoder sub-layer components.
  • Example 4: According to the method of Example 3, the using of the multilingual machine translation model to translate the original sentence and the using of the first target adapter of each sub-layer component to correct the original output value of each sub-layer component to obtain the target sentence includes: determining, according to the connection relationships of the multiple sub-layer components in the multilingual machine translation model, the first sub-layer component in the multilingual machine translation model as the current sub-layer component, and obtaining the target input data of the current sub-layer component; determining the first target adapter of the current sub-layer component as the current target adapter; inputting the target input data into the current sub-layer component and the current target adapter, so as to obtain the original output data of the current sub-layer component and the current correction parameters output by the current target adapter; correcting the original output data by using the current correction parameters to obtain the target output data of the current sub-layer component; determining the target output data as the target input data of the next-level component, determining the next-level component as the current sub-layer component, and returning to the operation of determining the first target adapter of the current sub-layer component as the current target adapter, until there is no next-level component; and, when there is no next-level component, inputting the target output data of the current sub-layer component into the next layer of the current sub-layer component to obtain the target sentence.
  • Example 5: According to the method of Example 3 or 4, the multilingual machine translation model further includes an input word embedding layer and an output word embedding layer; the output end of the input word embedding layer is connected to the input end of the first encoder sub-layer component in the multilingual machine translation model, and the output end of the output word embedding layer is connected to the input end of the first decoder sub-layer component in the multilingual machine translation model.
  • Example 6: The method of Example 5 further includes: when receiving the original word embedding output data of a word embedding layer, inputting the original word embedding output data into the second target adapter of the word embedding layer, and obtaining the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; and correcting the original word embedding output data by using the word embedding correction parameters to obtain the target word embedding output data of the word embedding layer, so as to use the target word embedding output data as the target input data of the sub-layer component connected to the word embedding layer.
  • Example 7: According to the method of any one of Examples 1-4, before the acquiring of the original sentence to be translated and the translation language information of the original sentence, the method further includes: for each piece of translation language information, acquiring multiple training samples consistent with the translation language information, and inputting the multiple training samples into the multilingual machine translation model, so as to obtain, through training, the adapter corresponding to the translation language information.
  • Example 8 provides a translation apparatus based on a multilingual machine translation model, including:
  • a sentence acquisition module, configured to acquire the original sentence to be translated and the translation language information of the original sentence; an adapter determination module, configured to determine a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used to correct translation errors of a preset multilingual machine translation model; and a translation module, configured to translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
  • Example 9 provides an electronic device, comprising: one or more processors; and a memory arranged to store one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the translation method based on a multilingual machine translation model described in any one of Examples 1-7.
  • Example 10 provides a computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the translation method based on a multilingual machine translation model described in any one of Examples 1-7.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a translation method and apparatus using a multilingual machine translation model, and to a device and a medium. The translation method using a multilingual machine translation model comprises the steps of: acquiring an original sentence to be translated and translation language information of the original sentence (S101); determining a target adapter corresponding to the translation language information of the original sentence, the target adapter being used to correct translation errors of a preset multilingual machine translation model (S102); and translating the original sentence on the basis of the multilingual machine translation model and the target adapter to obtain a target sentence (S103).
PCT/CN2021/131090 2020-12-04 2021-11-17 Translation method and apparatus using a multilingual machine translation model, device and medium WO2022116821A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011409340.1A CN112380876A (zh) 2020-12-04 2020-12-04 基于多语言机器翻译模型的翻译方法、装置、设备和介质
CN202011409340.1 2020-12-04

Publications (1)

Publication Number Publication Date
WO2022116821A1 (fr)

Family

ID=74590507

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/131090 WO2022116821A1 (fr) 2020-12-04 2021-11-17 Translation method and apparatus using a multilingual machine translation model, device and medium

Country Status (2)

Country Link
CN (1) CN112380876A (fr)
WO (1) WO2022116821A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115438678A (zh) * 2022-11-08 2022-12-06 苏州浪潮智能科技有限公司 机器翻译方法、装置、电子设备及存储介质
CN115688815A (zh) * 2022-12-30 2023-02-03 北京澜舟科技有限公司 多语言翻译模型构建方法及存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380876A (zh) * 2020-12-04 2021-02-19 北京有竹居网络技术有限公司 基于多语言机器翻译模型的翻译方法、装置、设备和介质

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543824A (zh) * 2018-11-30 2019-03-29 腾讯科技(深圳)有限公司 一种序列模型的处理方法和装置
US20200034436A1 (en) * 2018-07-26 2020-01-30 Google Llc Machine translation using neural network models
CN110852116A (zh) * 2019-11-07 2020-02-28 腾讯科技(深圳)有限公司 非自回归神经机器翻译方法、装置、计算机设备和介质
CN111178093A (zh) * 2019-12-20 2020-05-19 沈阳雅译网络技术有限公司 一种基于堆叠算法的神经机器翻译系统训练加速方法
CN111222347A (zh) * 2020-04-15 2020-06-02 北京金山数字娱乐科技有限公司 语句翻译模型的训练方法及装置、语句翻译方法及装置
CN111814493A (zh) * 2020-04-21 2020-10-23 北京嘀嘀无限科技发展有限公司 机器翻译方法、装置、电子设备和存储介质
CN111859927A (zh) * 2020-06-01 2020-10-30 北京先声智能科技有限公司 一种基于注意力共享Transformer的语法改错模型
CN112380876A (zh) * 2020-12-04 2021-02-19 北京有竹居网络技术有限公司 基于多语言机器翻译模型的翻译方法、装置、设备和介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7039625B2 (en) * 2002-11-22 2006-05-02 International Business Machines Corporation International information search and delivery system providing search results personalized to a particular natural language
US8880770B2 (en) * 2012-06-07 2014-11-04 Apple Inc. Protocol translating adapter
US10909331B2 (en) * 2018-03-30 2021-02-02 Apple Inc. Implicit identification of translation payload with neural machine translation
CN111460838B (zh) * 2020-04-23 2023-09-22 腾讯科技(深圳)有限公司 智能翻译模型的预训练方法、装置和存储介质

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200034436A1 (en) * 2018-07-26 2020-01-30 Google Llc Machine translation using neural network models
CN109543824A (zh) * 2018-11-30 2019-03-29 腾讯科技(深圳)有限公司 一种序列模型的处理方法和装置
CN110852116A (zh) * 2019-11-07 2020-02-28 腾讯科技(深圳)有限公司 非自回归神经机器翻译方法、装置、计算机设备和介质
CN111178093A (zh) * 2019-12-20 2020-05-19 沈阳雅译网络技术有限公司 一种基于堆叠算法的神经机器翻译系统训练加速方法
CN111222347A (zh) * 2020-04-15 2020-06-02 北京金山数字娱乐科技有限公司 语句翻译模型的训练方法及装置、语句翻译方法及装置
CN111814493A (zh) * 2020-04-21 2020-10-23 北京嘀嘀无限科技发展有限公司 机器翻译方法、装置、电子设备和存储介质
CN111859927A (zh) * 2020-06-01 2020-10-30 北京先声智能科技有限公司 一种基于注意力共享Transformer的语法改错模型
CN112380876A (zh) * 2020-12-04 2021-02-19 北京有竹居网络技术有限公司 基于多语言机器翻译模型的翻译方法、装置、设备和介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NEIL HOULSBY; ANDREI GIURGIU; STANISLAW JASTRZEBSKI; BRUNA MORRONE; QUENTIN DE LAROUSSILHE; ANDREA GESMUNDO; MONA ATTARIYAN; SYLVAIN GELLY: "Parameter-Efficient Transfer Learning for NLP", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 Olin Library Cornell University, Ithaca, NY 14853, 2 February 2019 (2019-02-02), XP081024612 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115438678A (zh) * 2022-11-08 2022-12-06 苏州浪潮智能科技有限公司 机器翻译方法、装置、电子设备及存储介质
CN115688815A (zh) * 2022-12-30 2023-02-03 北京澜舟科技有限公司 多语言翻译模型构建方法及存储介质
CN115688815B (zh) * 2022-12-30 2023-03-31 北京澜舟科技有限公司 多语言翻译模型构建方法及存储介质

Also Published As

Publication number Publication date
CN112380876A (zh) 2021-02-19

Similar Documents

Publication Publication Date Title
WO2022116821A1 (fr) Translation method and apparatus using a multilingual machine translation model, device and medium
CN111046677B (zh) 一种翻译模型的获取方法、装置、设备和存储介质
WO2022116841A1 (fr) Procédé, appareil et dispositif de traduction de texte, et support de stockage
WO2022228041A1 (fr) Procédé, appareil et dispositif d'entraînement de modèle de traduction, et support de stockage
CN111008533B (zh) 一种翻译模型的获取方法、装置、设备和存储介质
US20110125486A1 (en) Self-configuring language translation device
WO2022127620A1 (fr) Procédé et appareil de réveil vocal, dispositif électronique et support de stockage
WO2022228221A1 (fr) Procédé, appareil et dispositif de traduction d'informations et support de stockage
CN111382261B (zh) 摘要生成方法、装置、电子设备及存储介质
US11270690B2 (en) Method and apparatus for waking up device
CN111597825B (zh) 语音翻译方法、装置、可读介质及电子设备
CN113378586B (zh) 语音翻译方法、翻译模型训练方法、装置、介质及设备
CN112883967B (zh) 图像字符识别方法、装置、介质及电子设备
CN111368560A (zh) 文本翻译方法、装置、电子设备及存储介质
WO2023103897A1 (fr) Procédé, appareil et dispositif de traitement d'images et support de stockage
WO2024099342A1 (fr) Procédé et appareil de traduction, support lisible et dispositif électronique
WO2023082931A1 (fr) Procédé de récupération de ponctuation dans la reconnaissance de la parole, et dispositif et support d'enregistrement
WO2022116819A1 (fr) Procédé et appareil d'entraînement de modèle, procédé et appareil de traduction automatique, dispositif, et support de stockage
CN112309384B (zh) 一种语音识别方法、装置、电子设备及介质
CN111104796A (zh) 用于翻译的方法和装置
WO2023138361A1 (fr) Procédé et appareil de traitement d'image, support de stockage lisible et dispositif électronique
WO2022121859A1 (fr) Procédé et appareil de traitement d'informations en une langue parlée et dispositif électronique
CN112257459B (zh) 语言翻译模型的训练方法、翻译方法、装置和电子设备
CN115967833A (zh) 视频生成方法、装置、设备计存储介质
CN111221424B (zh) 用于生成信息的方法、装置、电子设备和计算机可读介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21899855

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21899855

Country of ref document: EP

Kind code of ref document: A1