CN113283249A - Machine translation method, device and computer readable storage medium - Google Patents

Machine translation method, device and computer readable storage medium

Info

Publication number
CN113283249A
Authority
CN
China
Prior art keywords
machine translation
training
word
sample
source language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010102636.2A
Other languages
Chinese (zh)
Inventor
罗维
陈博兴
黄非
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN202010102636.2A priority Critical patent/CN113283249A/en
Publication of CN113283249A publication Critical patent/CN113283249A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a machine translation method, a machine translation device and a computer readable storage medium. The method comprises the following steps: obtaining a sentence to be translated; inputting the sentence to be translated into a machine translation model obtained by pre-training, and calculating to obtain a translated target sentence; wherein the machine translation model is obtained by: obtaining training samples, each comprising a source language and a target language; generating adversarial samples for the source language in the training samples; and training on the adversarial samples to obtain the machine translation model. According to the disclosed method, the robustness of a machine translation model can be improved.

Description

Machine translation method, device and computer readable storage medium
Technical Field
The present invention relates to the field of neural network technologies, and in particular, to a machine translation method, a machine translation apparatus, and a computer-readable storage medium.
Background
Machine translation is the process of using a computer to convert one natural language (the source language) into another natural language (the target language). With the rapid development of economic globalization and the internet, machine translation has important practical value and plays an increasingly important role in promoting political, economic and cultural exchange.
A commonly used machine translation model is Neural Machine Translation (NMT), which is modeled with neural network technology. Because spoken text is informal and exhibits characteristics such as homophone/near-homophone substitutions, omissions, inversions and other errors, NMT easily makes mistakes when translating spoken text.
Therefore, the inventors believe the machine translation model needs to be improved to increase its robustness.
Disclosure of Invention
An object of the embodiments of the present invention is to provide a new technical solution for machine translation.
According to a first aspect of the embodiments of the present invention, there is provided a machine translation method, including:
obtaining a sentence to be translated;
inputting the sentence to be translated into a machine translation model obtained by pre-training, and calculating to obtain a translated target sentence;
wherein the machine translation model is obtained by:
obtaining training samples; each training sample comprises a source language and a target language;
generating adversarial samples for the source language in the training samples;
and training on the adversarial samples to obtain a machine translation model.
Optionally, the obtaining training samples includes:
obtaining the source language;
determining word segmentation information and pinyin information of the source language;
mapping the word segmentation information and the pinyin information respectively, the word segmentation information to word segmentation feature vectors and the pinyin information to pinyin feature vectors; and forming the training sample from the word segmentation feature vectors and the pinyin feature vectors.
Optionally, the generating adversarial samples for the source language in the training samples comprises:
randomly sampling the source language and determining the positions of the replaced words;
determining a candidate word set;
and selecting replacement words from the candidate word set according to the gradient direction, adding them at the positions of the replaced words, and generating the adversarial samples.
Optionally, the forming the training sample from the word segmentation feature vectors and the pinyin feature vectors comprises:
performing a fusion calculation on the word segmentation feature vectors and the pinyin feature vectors to obtain the training sample.
Optionally, the training on the adversarial samples to obtain a machine translation model comprises:
training a loss function according to the adversarial samples and the target language in the training samples to obtain the machine translation model.
Optionally, the loss function is the cumulative sum of a source language loss term, a word segmentation information loss term and a pinyin information loss term.
According to a second aspect of the present invention, there is also provided a machine translation apparatus, the apparatus comprising:
the acquisition module is used for obtaining the sentence to be translated;
the calculation module is used for inputting the sentence to be translated into a machine translation model obtained by pre-training and calculating to obtain a translated target sentence;
wherein the machine translation model is trained by a training module; the training module comprises:
an acquisition unit for acquiring a training sample; each training sample comprises a source language and a target language;
a generating unit, configured to generate adversarial samples for the source language in the training samples;
and a training unit, configured to train on the adversarial samples to obtain a machine translation model.
Optionally, the generating unit is specifically configured to:
randomly sampling the source language and determining the positions of the replaced words; determining a candidate word set; and selecting replacement words from the candidate word set according to the gradient direction, adding them at the positions of the replaced words, and generating the adversarial samples.
According to a third aspect of the present invention, there is also provided a machine translation apparatus, the apparatus comprising:
a memory and a processor; the memory is for storing executable instructions and the processor is for performing operations in the machine translation method according to any one of the first aspects of the invention under control of the instructions.
According to a fourth aspect of the present invention, there is also provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the machine translation method according to any one of the first aspects of the present invention.
According to one embodiment of the invention, pinyin information is introduced and disturbance information is added to the source-language and target-language inputs, so that the modeling capability of the encoder in the machine translation model when handling homophone/near-homophone errors is improved, the robustness of the machine translation model is improved, and the machine translation model can better handle noisy input text.
Other features of the present invention and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a block diagram showing a hardware configuration of an electronic device that can implement an embodiment of the present invention.
FIG. 2 shows a schematic flow diagram of a machine translation method of an embodiment of the present invention;
FIG. 3 is a schematic flow diagram of training a machine translation model in a machine translation method of an embodiment of the present invention;
FIG. 4 is a diagram illustrating a machine translation method fusing Pinyin information to a source language according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating an encoder input perturbation by a machine translation method according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating perturbation of a decoder input by a machine translation method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a method of machine translation according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating a machine translation device according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a machine translation apparatus according to a second embodiment of the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Various embodiments and examples according to embodiments of the present invention are described below with reference to the accompanying drawings.
< hardware configuration >
Fig. 1 is a block diagram showing a hardware configuration of an electronic device that can implement an embodiment of the present invention.
The electronic device 1000 may be a computer, terminal device, or the like.
In one example, the electronic device 1000 may include a processor 1100, a memory 1200, an interface device 1300, a communication device 1400, a display device 1500, an input device 1600, an output device 1700, and a camera device 1800, as shown in fig. 1. Although the electronic device 1000 may also include a speaker, a microphone, and the like, these components are not relevant to the present invention and are omitted here.
The processor 1100 may be, for example, a central processing unit CPU, a microprocessor MCU, or the like. The memory 1200 includes, for example, a ROM (read only memory), a RAM (random access memory), a nonvolatile memory such as a hard disk, and the like. The interface device 1300 includes, for example, a USB interface, a serial interface, and the like. Communication device 1400 is capable of wired or wireless communication, for example. The display device 1500 is, for example, a liquid crystal display panel. The input device 1600 may include, for example, a touch screen, a keyboard, and the like.
The electronic device shown in fig. 1 is merely illustrative and is in no way meant to limit the invention, its application, or uses. In an embodiment of the present invention, the memory 1200 of the electronic device 1000 is used for storing instructions for controlling the processor 1100 to operate so as to execute any one of the machine translation methods provided by the embodiment of the present invention.
It will be appreciated by those skilled in the art that, although a plurality of means are shown for the electronic device 1000 in fig. 1, the present invention may relate to only some of them, for example only the processor 1100 and the memory 1200 of the electronic device 1000.
The skilled person can design the instructions according to the disclosed solution. How the instructions control the operation of the processor is well known in the art and will not be described in detail herein.
< method examples >
FIG. 2 shows a schematic flow diagram of a machine translation method of an embodiment of the present invention.
The machine translation method of the present embodiment may be specifically executed by the electronic device 1000 shown in fig. 1.
As shown in fig. 2, in step 2100, a sentence to be translated is obtained.
Step 2200, inputting the sentence to be translated into a machine translation model obtained by pre-training, and calculating to obtain a translated target sentence.
In this step, the machine translation model is used to translate the sentence to be translated, and an output translated target sentence is obtained.
Specifically, as shown in fig. 3, the machine translation model is obtained as follows:
Step 3100, obtaining training samples; wherein each training sample comprises a source language and a target language.
In this step, pinyin information is added to the source language in the training sample to address the homophone/near-homophone problem. Specifically, the electronic device 1000 obtains the source language and determines word segmentation information and pinyin information of the source language; maps the word segmentation information to word segmentation feature vectors and the pinyin information to pinyin feature vectors; and forms the training sample from the word segmentation feature vectors and the pinyin feature vectors.
The electronic device 1000 may perform a fusion calculation on the word segmentation feature vectors and the pinyin feature vectors to obtain the training sample. Optionally, the fusion calculation may be done by linear weighting or by a nonlinear transformation; this embodiment places no particular limitation on it.
As shown in FIG. 4, the obtained source language sentence means "without any member voting against him", and the corresponding pinyin information is "mei you ren he yi yuan tou piao fan dui ta". The electronic device 1000 first maps (embeds) the pinyin information, and then fuses the word segmentation information and the pinyin information through linear weighting, a nonlinear transformation or another fusion method to obtain a training sample. For example, the source language may be represented as E(w) = fusion(E_word(w), avg_{p_i ∈ w}(E_phonetic(p_i))), where E_word(w) is the embedding of word w and E_phonetic(p_i) is the embedding of its i-th pinyin syllable, averaged over the syllables of w.
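As a minimal illustrative sketch (not taken from the patent), the fusion could be implemented in Python/PyTorch as follows; the class name, the 0-padding convention for pinyin syllables, and the linear-weighting coefficient alpha are assumptions, since the patent leaves the fusion function open to linear weighting or a nonlinear transformation:

import torch
import torch.nn as nn

class PinyinFusedEmbedding(nn.Module):
    # Fuses word embeddings with averaged pinyin-syllable embeddings:
    # E(w) = fusion(E_word(w), avg over p_i in w of E_phonetic(p_i)).
    def __init__(self, vocab_size, pinyin_vocab_size, dim, alpha=0.5):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)
        self.pinyin_emb = nn.Embedding(pinyin_vocab_size, dim, padding_idx=0)
        self.alpha = alpha  # linear-weighting coefficient (one possible fusion)

    def forward(self, word_ids, pinyin_ids):
        # word_ids: (batch, seq); pinyin_ids: (batch, seq, syllables), 0-padded
        e_word = self.word_emb(word_ids)                # (B, S, D)
        e_pin = self.pinyin_emb(pinyin_ids)             # (B, S, P, D)
        mask = (pinyin_ids != 0).unsqueeze(-1).float()  # ignore padded syllables
        avg_pin = (e_pin * mask).sum(dim=2) / mask.sum(dim=2).clamp(min=1.0)
        # Linear weighting; a nonlinear variant could instead apply a Linear
        # layer to the concatenation of e_word and avg_pin.
        return self.alpha * e_word + (1.0 - self.alpha) * avg_pin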
Step 3200, generating adversarial samples for the source language in the training samples.
Specifically, the electronic device 1000 may randomly sample the source language to determine the positions of the replaced words; determine a candidate word set; and select replacement words from the candidate word set according to the gradient direction, add them at the positions of the replaced words, and generate the adversarial samples.
In practical applications, the mainstream NMT model follows an encoder-decoder framework, with attention building a bridge between the encoder and the decoder, so an attention score is naturally computed between the two. On the encoder side, positions are sampled from a uniform distribution: for example, if the input is a sentence of 10 words, each position is sampled with probability 1/10, and N positions can be sampled from this distribution and perturbed. For the decoder, however, the disturbance added to the encoder input affects the decoder's operation, so sampling from a uniform distribution alone is not desirable. An improved strategy is therefore devised: the attention scores of the perturbed encoder words are used to establish a more informative distribution, and the perturbed decoder positions are sampled from that distribution, as the sketch below illustrates.
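To make the two sampling distributions concrete, here is a rough Python/PyTorch sketch; the attention matrix, the sentence lengths and the number of perturbed positions are placeholder assumptions, not values from the patent:

import torch

def sample_positions(weights, n):
    # Draw n distinct positions to perturb from a distribution over tokens.
    return torch.multinomial(weights, n, replacement=False)

# Encoder side: uniform distribution (each of 10 words sampled with probability 1/10).
enc_weights = torch.full((10,), 1.0 / 10)
enc_positions = sample_positions(enc_weights, n=2)

# Decoder side: build the sampling distribution from the encoder-decoder
# attention scores of the perturbed source positions. attn stands for a
# (tgt_len, src_len) attention matrix from a forward pass; the random
# matrix here is only a stand-in for a real one.
attn = torch.rand(8, 10).softmax(dim=-1)
dec_weights = attn[:, enc_positions].sum(dim=1)
dec_weights = dec_weights / dec_weights.sum()
dec_positions = sample_positions(dec_weights, n=2)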
Specifically, as shown in fig. 5, perturbing the encoder input mainly involves sampling the positions of the replaced words (sample positions): the random sampling is based on either a uniform distribution or a distribution given by the magnitudes of the gradient vectors. For example, in fig. 5 the words at the 2nd and 6th positions ("any" and "against", respectively) are selected for replacement, followed by selection of the replacement words.
Sampling the candidate set of replacement words (sample candidate sets) can be done in two ways. The first is to sample replacement words from the whole vocabulary, but selecting replacement words from the full vocabulary is computationally very expensive: if the vocabulary contains 50,000 words, 50,000 operations are required at each position to be replaced, a time complexity of O(|V|) that results in a huge training cost. The second is to obtain a candidate set of replacement words from the whole vocabulary through a Language Model (LM) or random sampling, i.e., to select n candidate words (n << |V|) as the candidate set.
In fig. 5, the scores behind W1, W2, ..., Wn form a probability distribution over the vocabulary, which can be regarded as word weights, and in this embodiment the electronic device 1000 samples from this distribution. Specifically, one sampling method is uniform random sampling (Random), and the other computes the distribution with a context-based language model (LM). Neither is specifically limited here; a sketch of the candidate-set step follows.
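A hedged sketch of the candidate-set step (the function name and the sizes are illustrative assumptions):

import torch

def sample_candidate_set(vocab_size, n, word_probs=None):
    # Select n candidate replacement words (n << |V|) for one position.
    # word_probs: optional distribution over the vocabulary, e.g. computed by
    # a context-based language model (LM); if None, sample uniformly (Random).
    if word_probs is None:
        word_probs = torch.full((vocab_size,), 1.0 / vocab_size)
    return torch.multinomial(word_probs, n, replacement=False)

# The downstream gradient comparison now touches only n words instead of all |V|.
candidates = sample_candidate_set(vocab_size=50000, n=20)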
After obtaining the candidate set of replacement words, the electronic device 1000 selects the word closest to the gradient direction from the candidate set (gradient-based selection). Unlike images, whose pixels are continuous, text is discrete: the gradient of the loss function with respect to the embedding vector at a sampled position does not necessarily point at a word in the candidate set (or even in the whole vocabulary), so the word closest to the gradient direction must be found. That is, the gradient is computed with respect to the word vector at each sampled position; for "any" and "against" in fig. 5, the gradients with respect to the embedding vectors of the two words are computed, and the replacement is selected, for example, by the formula

x_i' = argmax_{x ∈ Vx} sim(e(x) − e(x_i), g_{x_i}),

where

g_{x_i} = ∇_{e(x_i)} L,

finding "which" and "negate" to replace "any" and "against" in the input, respectively.
The function e(x) denotes the embedding vector corresponding to a word x, where x is any element of the word set Vx; Vx can be the whole vocabulary space or a pruned candidate set. e(x_i) denotes the embedding vector of x_i, which refers specifically to the word at position i. g_{x_i} denotes the gradient of the loss term with respect to the embedding of the word at the i-th position. sim(a, b) denotes the similarity of a and b, where a and b are both embedding vectors.
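A minimal sketch of the gradient-based selection under these definitions; the helper name and the use of cosine similarity for sim are assumptions, as the patent only requires some similarity between embedding vectors:

import torch
import torch.nn.functional as F

def select_replacement(candidate_ids, embedding, x_i, grad_x_i):
    # Implements x_i' = argmax over x in Vx of sim(e(x) - e(x_i), g_{x_i}).
    # grad_x_i is the gradient of the loss with respect to the embedding of
    # the word at position i (e.g. obtained with torch.autograd.grad on the
    # translation loss); embedding is the model's nn.Embedding table.
    e_cand = embedding(candidate_ids)          # (n, D) candidate embeddings
    e_xi = embedding(torch.tensor([x_i]))      # (1, D) current word embedding
    offsets = e_cand - e_xi                    # e(x) - e(x_i)
    sims = F.cosine_similarity(offsets, grad_x_i.unsqueeze(0), dim=-1)
    return candidate_ids[sims.argmax()]        # id of the replacement word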
For example, in fig. 5 the original encoder input is the sentence meaning "without any member voting against him"; after the perturbation is added, the words at the sampled positions are replaced, e.g. "any" by "which" and "against" by "negate". It should be noted that this is an idealized result of the perturbation; in practice the sentence semantics may change slightly once the perturbation is applied, but this can still be expected to improve model robustness, and on some data sets translation quality may even improve.
Further, considering that perturbing only the encoder input cannot sufficiently improve the robustness of the machine translation model, the decoder input also needs to be perturbed. The specific process is shown in fig. 6; the algorithm is substantially the same as for the encoder-input perturbation, the only difference being how the positions of the replaced words are sampled: here they are sampled from the distribution given by the attention scores of the perturbed encoder positions.
After obtaining the adversarial samples, step 3300 is performed:
In step 3300, training is performed based on the adversarial samples, resulting in a machine translation model.
A loss function is trained according to the adversarial samples and the target language in the training samples to obtain the machine translation model. As shown in fig. 7, the left part of the figure uses the methods described in fig. 4, fig. 5 and fig. 6, respectively, while the right part shows this embodiment's improvement to the loss function, which is composed of three loss terms: a source language loss term, a word segmentation information loss term and a pinyin information loss term. After the loss value is computed, the parameters of the machine translation model are updated according to the loss value until they converge; a rough sketch of such a composite loss follows.
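To illustrate how the three-term loss could drive a training step, here is a hedged sketch; the exact form of each term is not fixed by the patent, so the cross-entropy interpretation and the helpers model.nll and model.nll_pinyin are hypothetical:

import torch

def composite_loss(model, src, src_adv, pinyin, tgt):
    # Loss = source loss + word-segmentation loss + pinyin loss, taken as the
    # cumulative sum of three terms. Each term is assumed here to be a
    # cross-entropy translation loss over a different view of the input.
    loss_src = model.nll(src, tgt)            # clean source -> target
    loss_seg = model.nll(src_adv, tgt)        # adversarial segmented input -> target
    loss_pin = model.nll_pinyin(pinyin, tgt)  # pinyin view of the source -> target
    return loss_src + loss_seg + loss_pin

# One training step: compute the loss, backpropagate, update until convergence.
# optimizer.zero_grad()
# loss = composite_loss(model, src, src_adv, pinyin, tgt)
# loss.backward()
# optimizer.step()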
According to the technical solution of this embodiment, an obtained sentence to be translated can be input into a machine translation model obtained by pre-training, and a translated target sentence is obtained by calculation; the machine translation model is obtained by obtaining training samples, generating adversarial samples for the source language in the training samples, and then training on the adversarial samples. The method of this embodiment can improve the robustness of the machine translation model. Moreover, by introducing pinyin information and adding disturbance information to the source-language and target-language inputs, the modeling capability of the encoder in the machine translation model when handling homophone/near-homophone errors is improved, the robustness of the machine translation model is improved, and the model can better handle noisy input text.
< apparatus embodiment >
Fig. 8 is a schematic structural diagram of a machine translation apparatus according to an embodiment of the present invention.
As shown in fig. 8, the machine translation apparatus 8000 of the present embodiment may include: an acquisition module 8100 and a calculation module 8200.
The obtaining module 8100 is configured to obtain a statement to be translated;
and the calculating module 8200 is used for inputting the sentence to be translated into a machine translation model obtained by pre-training and calculating to obtain a translated target sentence.
The machine translation model is trained by the training module 8300. Specifically, the training module 8300 may include: an acquisition unit 8301, a generating unit 8302 and a training unit 8303. The acquisition unit 8301 is used to obtain training samples; each training sample comprises a source language and a target language. The generating unit 8302 is used to generate adversarial samples for the source language in the training samples. The training unit 8303 is configured to train on the adversarial samples to obtain a machine translation model.
Specifically, the acquisition unit 8301 may be configured to obtain the source language and determine word segmentation information and pinyin information of the source language; to map the word segmentation information to word segmentation feature vectors and the pinyin information to pinyin feature vectors; and to form the training sample from the word segmentation feature vectors and the pinyin feature vectors. In particular, the acquisition unit 8301 may perform a fusion calculation on the word segmentation feature vectors and the pinyin feature vectors to obtain the training sample.
Specifically, the generating unit 8302 is configured to randomly sample the source language and determine the positions of the replaced words; determine a candidate word set; and select replacement words from the candidate word set according to the gradient direction, add them at the positions of the replaced words, and generate the adversarial samples.
Specifically, the training unit 8303 is configured to train a loss function according to the adversarial samples and the target language in the training samples to obtain the machine translation model, where the loss function is the cumulative sum of the source language loss term, the word segmentation information loss term and the pinyin information loss term.
Fig. 9 is a schematic structural diagram of a machine translation apparatus according to a second embodiment of the present invention.
As shown in fig. 9, in this embodiment, the machine translation device 9000 can comprise a memory 9100 and a processor 9200; the memory 9100 is used for storing executable instructions, and the processor 9200 is used for executing the operations in the machine translation method provided by any one of the above embodiments according to the control of the instructions.
Those skilled in the art will appreciate that the machine translation apparatus may be implemented in a variety of ways. For example, a machine translation apparatus may be implemented by an instruction configuration processor. For example, the machine translation apparatus may be implemented by storing instructions in ROM and reading the instructions from ROM into a programmable device when the device is started. For example, the machine translation device may be solidified into a dedicated device (e.g., ASIC). The machine translation apparatus may be divided into separate units or they may be combined together for implementation. The machine translation means may be implemented in one of the various implementations described above, or may be implemented in a combination of two or more of the various implementations described above.
< computer-readable storage Medium >
In this embodiment, there is also provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a machine translation method according to any of the embodiments of the present invention.
It is well known to those skilled in the art that, with the development of electronic information technology such as large-scale integrated circuit technology and the trend of implementing software in hardware, it has become difficult to draw a clear boundary between the software and hardware of a computer system, since any operation may be implemented in software or hardware and any instruction may be executed by hardware as well as by software. Whether a hardware or a software implementation is adopted for a given machine function depends on non-technical factors such as price, speed, reliability, storage capacity and product cycle. For the skilled person, a software implementation and a hardware implementation are equivalent, and software or hardware can be chosen to implement the above scheme as desired. Therefore, no specific software or hardware is limited here.
The present invention may be an apparatus, method and/or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present invention.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present invention are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, which can execute the computer-readable program instructions.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims (10)

1. A method of machine translation, the method comprising:
obtaining a sentence to be translated;
inputting the sentence to be translated into a machine translation model obtained by pre-training, and calculating to obtain a translated target sentence;
wherein the machine translation model is obtained by:
obtaining training samples; each training sample comprises a source language and a target language;
generating adversarial samples for the source language in the training samples;
and training on the adversarial samples to obtain the machine translation model.
2. The method of claim 1, wherein the obtaining training samples comprises:
obtaining the source language;
determining word segmentation information and pinyin information of the source language;
mapping the word segmentation information and the pinyin information respectively, the word segmentation information to word segmentation feature vectors and the pinyin information to pinyin feature vectors;
and forming the training sample from the word segmentation feature vectors and the pinyin feature vectors.
3. The method of claim 2, wherein the generating adversarial samples for the source language in the training samples comprises:
randomly sampling the source language and determining the positions of the replaced words;
determining a candidate word set;
and selecting replacement words from the candidate word set according to the gradient direction, adding them at the positions of the replaced words, and generating the adversarial samples.
4. The method of claim 2, wherein the forming the training sample from the word segmentation feature vectors and the pinyin feature vectors comprises:
performing a fusion calculation on the word segmentation feature vectors and the pinyin feature vectors to obtain the training sample.
5. The method of claim 3, wherein the training on the adversarial samples to obtain a machine translation model comprises:
training a loss function according to the adversarial samples and the target language in the training samples to obtain the machine translation model.
6. The method of claim 5, wherein the loss function is a cumulative sum of a source language loss term, a word segmentation information loss term and a pinyin information loss term.
7. A machine translation apparatus, the apparatus comprising:
the acquisition module is used for obtaining the sentence to be translated;
the calculation module is used for inputting the sentence to be translated into a machine translation model obtained by pre-training and calculating to obtain a translated target sentence;
wherein the machine translation model is trained by a training module; the training module comprises:
an acquisition unit for acquiring a training sample; each training sample comprises a source language and a target language;
a generating unit, configured to generate adversarial samples for the source language in the training samples;
and a training unit, configured to train on the adversarial samples to obtain the machine translation model.
8. The apparatus according to claim 7, wherein the generating unit is specifically configured to:
randomly sampling the source language and determining the positions of the replaced words; determining a candidate word set; and selecting replacement words from the candidate word set according to the gradient direction, adding them at the positions of the replaced words, and generating the adversarial samples.
9. A machine translation apparatus, the apparatus comprising:
a memory and a processor; the memory is configured to store executable instructions and the processor is configured to perform operations in the machine translation method of any of claims 1-6 under control of the instructions.
10. A computer-readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when being executed by a processor, carries out the machine translation method according to any one of claims 1-6.
CN202010102636.2A 2020-02-19 2020-02-19 Machine translation method, device and computer readable storage medium Pending CN113283249A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010102636.2A CN113283249A (en) 2020-02-19 2020-02-19 Machine translation method, device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010102636.2A CN113283249A (en) 2020-02-19 2020-02-19 Machine translation method, device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN113283249A (en) 2021-08-20

Family

ID=77274892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010102636.2A Pending CN113283249A (en) 2020-02-19 2020-02-19 Machine translation method, device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN113283249A (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190080205A1 (en) * 2016-09-16 2019-03-14 Deep Learning Analytics, LLC Systems and Methods for Deep Model Translation Generation
CN107066451A (en) * 2016-12-16 2017-08-18 中国科学院自动化研究所 The update method of man-machine interaction translation model and more new system
CN107368475A (en) * 2017-07-18 2017-11-21 中译语通科技(北京)有限公司 A kind of machine translation method and system based on generation confrontation neutral net
CN108304390A (en) * 2017-12-15 2018-07-20 腾讯科技(深圳)有限公司 Training method, interpretation method, device based on translation model and storage medium
CN110472251A (en) * 2018-05-10 2019-11-19 腾讯科技(深圳)有限公司 Method, the method for statement translation, equipment and the storage medium of translation model training
CN110750997A (en) * 2018-07-05 2020-02-04 普天信息技术有限公司 Machine translation method and device based on generation countermeasure learning
CN109271493A (en) * 2018-11-26 2019-01-25 腾讯科技(深圳)有限公司 A kind of language text processing method, device and storage medium
CN110427618A (en) * 2019-07-22 2019-11-08 清华大学 It fights sample generating method, medium, device and calculates equipment
CN110378474A (en) * 2019-07-26 2019-10-25 北京字节跳动网络技术有限公司 Fight sample generating method, device, electronic equipment and computer-readable medium
CN110598224A (en) * 2019-09-23 2019-12-20 腾讯科技(深圳)有限公司 Translation model training method, text processing device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王万良 (WANG Wanliang); 李卓蓉 (LI Zhuorong): "生成式对抗网络研究进展" ("Research Progress on Generative Adversarial Networks"), 通信学报 (Journal on Communications), no. 02 *

Similar Documents

Publication Publication Date Title
US11328129B2 (en) Artificial intelligence system using phrase tables to evaluate and improve neural network based machine translation
CN111738016B (en) Multi-intention recognition method and related equipment
CN108108342B (en) Structured text generation method, search method and device
CN112528637B (en) Text processing model training method, device, computer equipment and storage medium
CN113590761B (en) Training method of text processing model, text processing method and related equipment
CN110415679B (en) Voice error correction method, device, equipment and storage medium
CN109697292B (en) Machine translation method, device, electronic equipment and medium
US10963647B2 (en) Predicting probability of occurrence of a string using sequence of vectors
CN109408834B (en) Auxiliary machine translation method, device, equipment and storage medium
US20220391647A1 (en) Application-specific optical character recognition customization
CN111563390A (en) Text generation method and device and electronic equipment
CN115982376A (en) Method and apparatus for training models based on text, multimodal data and knowledge
US20210312122A1 (en) Methods and systems for generating documents with a targeted style
CN113434642B (en) Text abstract generation method and device and electronic equipment
US20210350090A1 (en) Text to visualization
CN115130437B (en) Intelligent document filling method and device and storage medium
CN112799658B (en) Model training method, model training platform, electronic device, and storage medium
CN112732896B (en) Target information display method, device, electronic equipment and medium
CN115470790A (en) Method and device for identifying named entities in file
CN113283249A (en) Machine translation method, device and computer readable storage medium
JP7194759B2 (en) Translation data generation system
CN110728137B (en) Method and device for word segmentation
CN114580399A (en) Text error correction method and device, electronic equipment and storage medium
US20210312223A1 (en) Automated determination of textual overlap between classes for machine learning
CN108682437B (en) Information processing method, device, medium and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination