CN111382577A

CN111382577A - Document translation method, device, electronic equipment and storage medium

Info

Publication number: CN111382577A
Application number: CN202010166968.7A
Authority: CN
Inventors: 王明轩; 孙泽维; 李磊
Original assignee: Beijing ByteDance Network Technology Co Ltd
Current assignee: Beijing ByteDance Network Technology Co Ltd
Priority date: 2020-03-11
Filing date: 2020-03-11
Publication date: 2020-07-07
Anticipated expiration: 2040-03-11
Also published as: CN111382577B

Abstract

The embodiment of the disclosure discloses a document translation method, a document translation device, an electronic device and a storage medium, wherein the method comprises the following steps: obtaining a source document of a source language to be translated, wherein the source document comprises a plurality of sentences; inputting the source document into a pre-trained document translation model to obtain a target document of a target language output by the document translation model, wherein the document translation model can directly translate the source document into the target document instead of translating the sentences of the source document one by one. The technical scheme of the embodiment of the disclosure can perform full-text translation by taking the document as a unit, so that the machine learning model can consider the semantics of the vocabulary in the whole text, and the translation is more accurate.

Description

Document translation method, device, electronic equipment and storage medium

Technical Field

The embodiment of the disclosure relates to the technical field of natural language processing, in particular to a document translation method, a document translation device, electronic equipment and a storage medium.

Background

Machine translation, which studies how to use a computer to automatically convert between different languages, is an important research field of natural language processing and artificial intelligence. Currently, the widely adopted approach is sentence-level translation.

Disclosure of Invention

In view of this, embodiments of the present disclosure provide a document translation method, an apparatus, an electronic device, and a storage medium, so as to implement full-text translation in units of documents.

Additional features and advantages of the disclosed embodiments will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosed embodiments.

In a first aspect of the present disclosure, an embodiment of the present disclosure provides a document translation method, including: obtaining a source document of a source language to be translated, wherein the source document comprises a plurality of sentences; inputting the source document into a pre-trained document translation model to obtain a target document of a target language output by the document translation model, wherein the document translation model can directly translate the source document into the target document instead of translating the sentences of the source document one by one.

In a second aspect of the present disclosure, an embodiment of the present disclosure further provides a document translation apparatus, including: a source document obtaining unit, configured to obtain a source document of a source language to be translated, wherein the source document comprises a plurality of sentences; and a target document obtaining unit, configured to input the source document into a pre-trained document translation model to obtain a target document of a target language output by the document translation model, wherein the document translation model can directly translate the source document into the target document instead of translating the sentences of the source document one by one.

In a third aspect of the disclosure, an electronic device is provided. The electronic device includes: a processor; and a memory for storing executable instructions that, when executed by the processor, cause the electronic device to perform the method of the first aspect.

In a fourth aspect of the disclosure, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the method in the first aspect.

According to the embodiments of the present disclosure, a source document of a source language to be translated is obtained and input into a pre-trained document translation model to obtain a target document of a target language output by the document translation model. Because the document translation model translates the source document directly into the target document rather than translating its sentences one by one, the machine learning model can consider the semantics of the vocabulary across the whole text, and the translation is more accurate.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed to be used in the description of the embodiments of the present disclosure will be briefly described below, and it is obvious that the drawings in the following description are only a part of the embodiments of the present disclosure, and for those skilled in the art, other drawings can be obtained according to the contents of the embodiments of the present disclosure and the drawings without creative efforts.

FIG. 1 is a flowchart illustrating a document translation method according to an embodiment of the present disclosure;

FIG. 2a is a diagram of an exemplary sentence-level translation effect;

FIG. 2b is a diagram of another exemplary sentence-level translation effect;

FIG. 2c is a diagram of yet another exemplary sentence-level translation effect;

FIG. 3 is a diagram showing the comparison of the translation effect between a document translated by the document translation method according to the present embodiment and a document translated by the sentence level translation method;

FIG. 4 is a flowchart diagram of an exemplary document translation model training method provided by an embodiment of the present disclosure;

FIG. 5 is a flowchart of an exemplary method for training to obtain the document translation model according to an embodiment of the present disclosure;

FIG. 6 is a schematic structural diagram of a document translation apparatus according to an embodiment of the present disclosure;

FIG. 7 is a block diagram illustrating an exemplary training module of a document translation model according to an embodiment of the present disclosure;

FIG. 8 shows a schematic structural diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.

Detailed Description

In order to make the technical problems solved, technical solutions adopted and technical effects achieved by the embodiments of the present disclosure clearer, the technical solutions of the embodiments of the present disclosure will be described in further detail below with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments, but not all embodiments, of the embodiments of the present disclosure. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present disclosure, belong to the protection scope of the embodiments of the present disclosure.

It should be noted that the terms "system" and "network" are often used interchangeably in the embodiments of the present disclosure. Reference to "and/or" in embodiments of the present disclosure is meant to include any and all combinations of one or more of the associated listed items. The terms "first", "second", and the like in the description and claims of the present disclosure and in the drawings are used for distinguishing between different objects and not for limiting a particular order.

It should also be noted that, in the embodiments of the present disclosure, each of the following embodiments may be executed alone, or may be executed in combination with each other, and the embodiments of the present disclosure are not limited specifically.

The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.

The technical solutions of the embodiments of the present disclosure are further described by the following detailed description in conjunction with the accompanying drawings.

Fig. 1 shows a flowchart of a document translation method provided by an embodiment of the present disclosure. The embodiment is applicable to the case where full-text translation is performed in units of documents, and the method can be executed by a document translation apparatus configured in an electronic device. As shown in fig. 1, the document translation method according to the embodiment includes:

in step S110, a source document in a source language to be translated is obtained, the source document including a plurality of sentences.

In step S120, the source document is input to a pre-trained document translation model to obtain a target document in a target language output by the document translation model, and the document translation model can translate the source document into the target document directly, instead of translating the sentences of the source document sentence by sentence. It should be noted that the source language and the target language belong to different languages.
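Steps S110 and S120 can be pictured with a minimal sketch. The names `DocumentModel`, `translate_document`, `translate_sentence_by_sentence` and `toy_model` are hypothetical and do not come from the disclosure, and the stand-in model only records how many sentences it could see in one call. The sketch shows only the difference in interface between passing the whole document at once and passing one sentence at a time.

```python
from typing import Callable, List

# Hypothetical type: a document-level model maps a whole source document
# (a list of sentences) to a whole target document.
DocumentModel = Callable[[List[str]], List[str]]


def translate_document(source_document: List[str], model: DocumentModel) -> List[str]:
    """Steps S110/S120: pass the whole document to the model in one call."""
    if len(source_document) < 2:
        raise ValueError("the source document is expected to contain a plurality of sentences")
    return model(source_document)


def translate_sentence_by_sentence(source_document: List[str], model: DocumentModel) -> List[str]:
    """Baseline for contrast: each sentence is translated in isolation."""
    target: List[str] = []
    for sentence in source_document:
        target.extend(model([sentence]))
    return target


def toy_model(sentences: List[str]) -> List[str]:
    """Stand-in model (assumption): tags each output with the visible context size."""
    return [f"{s} [context={len(sentences)}]" for s in sentences]


doc = ["First sentence.", "Second sentence.", "Third sentence."]
document_level = translate_document(doc, toy_model)
sentence_level = translate_sentence_by_sentence(doc, toy_model)
```

In the document-level call the stand-in model sees all three sentences at once, whereas in the sentence-by-sentence baseline it sees only one sentence per call.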

In order to better explain the technical solution of the present embodiment, current document translation methods are reviewed first. At present, the method generally adopted in the industry is to divide a document into a plurality of sentences and then translate at the sentence level, rather than translating the entire document directly. The reason is that, owing to the lack of large-scale parallel documents for training, the industry has generally overlooked document-level translation and has consistently focused on sentence-level translation. However, within a document, the meaning of a word in a sentence often has to be determined from a connected, developing and comprehensive perspective rather than an isolated, static and one-sided one; otherwise, inconsistent or erroneous translations occur.

For example, fig. 2a is a schematic diagram of an exemplary sentence-level translation effect, where the upper part is a Chinese source document to be translated and the lower part is the English target document obtained by a sentence-level translation method. As can be seen from fig. 2a, several sentences in the source document contain the same personal name, which should be translated into a single, unified target word within the same document. Because the translation tool adopts a sentence-level translation method, each sentence is translated independently and the translations of the name are not unified: both "Shuke" and "Shuk" appear in the target document, which is obviously unreasonable.
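The inconsistency described for fig. 2a can be detected mechanically. The following is a minimal sketch of such a check, not part of the disclosed method: the function name `find_inconsistent_terms`, the placeholder `<NAME>` for the source-language name, and the two hand-written input lists are all illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def find_inconsistent_terms(renderings: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Return source terms that received more than one target rendering.

    `renderings` holds (source_term, target_rendering) pairs, one per
    occurrence of the term in the document.
    """
    seen = defaultdict(set)
    for source_term, target_rendering in renderings:
        seen[source_term].add(target_rendering)
    return {term: forms for term, forms in seen.items() if len(forms) > 1}


# "<NAME>" is a placeholder for the source-language name shown in fig. 2a.
# Both lists are hand-written illustrations, not measured translation output.
mixed_renderings = [("<NAME>", "Shuke"), ("<NAME>", "Shuk"), ("<NAME>", "Shuke")]
uniform_renderings = [("<NAME>", "Shuke"), ("<NAME>", "Shuke"), ("<NAME>", "Shuke")]

inconsistent = find_inconsistent_terms(mixed_renderings)
consistent = find_inconsistent_terms(uniform_renderings)
```

The first list reproduces the "Shuke"/"Shuk" mix and is flagged; the second uses a single rendering and yields an empty result.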

For another example, fig. 2b is a schematic diagram of another exemplary sentence-level translation effect, where the upper part is a Chinese source document to be translated and the lower part is the English target document obtained by a sentence-level translation method. The title of a head of government is expressed differently in countries with different systems: for Britain, Russia and the like it should be rendered as "Prime Minister", for Germany as "Chancellor", and for China as "Premier". As seen in fig. 2b, the title in the first sentence of the source document, which refers to the German head of government, is translated as "the German Chancellor", so the later phrase in the source document that refers to the same office should, in the same context, also use "Chancellor" ("term as Chancellor"). Because the sentence-level translation method does not consider the context of the whole document, it translates that phrase in isolation from the preceding sentence as "Prime Minister", and a translation error occurs.

For another example, fig. 2c is a schematic diagram of yet another exemplary sentence-level translation effect, where the upper part is a Chinese source document to be translated and the lower part is the English target document obtained by a sentence-level translation method. The first sentence of the source document shows that the second and third sentences are set in the context of an application submitted to the court, which took place in the past, so the word "want" in those later sentences should be translated in the past tense rather than the present tense. Because the sentence-level translation method considers only the current sentence and not the other sentences of the document, it translates the word as "wants" in isolation from the preceding sentence. This kind of translation problem is difficult for a sentence-level translation method to avoid.

In the method of this embodiment, the document is taken as the unit: the source document is input into a pre-trained document translation model and translated directly into the target document, instead of the sentences of the source document being translated one by one, so that an unexpected technical effect is achieved, as shown in fig. 3. The reason is that more context information, such as the consistency of names across the context and the tense implied by the context, can be taken into account at the document level.

FIG. 3 is a diagram comparing the translation effect of a document translated by the document translation method according to the present embodiment with that of a document translated by the sentence-level translation method. The first row of the table in FIG. 3 contains two source documents, the second row contains the target documents obtained by translating the two source documents with the sentence-level translation method, and the third row contains the target documents obtained by translating the two source documents with the document translation method according to the present embodiment. As shown in fig. 3, the document translation method described in this embodiment determines the meaning of a word in a sentence from a connected, developing and comprehensive perspective rather than an isolated, static and one-sided one, so that at least the technical problems of fig. 2b and fig. 2c, such as inconsistent vocabulary translation, vocabulary translation errors and tense errors for actions whose time is fixed by the context, are overcome, and the translation quality can be significantly improved.

The document translation model described in this embodiment is a document-level translation model. It can be trained in a variety of ways, which this embodiment does not limit. For example, FIG. 4 shows an exemplary training method; as shown in FIG. 4, the document translation model may be trained by the following steps:

in step S410, a set of training document pairs is obtained, wherein a training document pair comprises a first document in the source language and a second document in the target language.

In step S420, an initialized document translation model is determined, wherein the initialized document translation model includes a target layer for outputting a translation result document.

In step S430, a first document of a training document pair in the set of training document pairs is used as an input of the initialized document translation model, and a second document of the training document pair is used as an expected output of the initialized document translation model, and the document translation model is obtained through training.
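Steps S410 to S430 can be sketched as a data structure plus a training loop. This is only an outline under stated assumptions: the names `TrainingDocumentPair`, `train` and `dummy_step` are hypothetical, and `dummy_step` is a stand-in that updates nothing and merely returns a number so that the loop can run without a real model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class TrainingDocumentPair:
    """S410: one element of the set of training document pairs."""
    first_document: List[str]   # sentences in the source language
    second_document: List[str]  # sentences in the target language


def train(
    pairs: List[TrainingDocumentPair],
    train_step: Callable[[List[str], List[str]], float],
    epochs: int = 1,
) -> List[float]:
    """S430: feed each first document as input with its second document as
    the expected output. `train_step` (hypothetical) is assumed to update the
    initialized model of S420 and return a loss value."""
    losses: List[float] = []
    for _ in range(epochs):
        for pair in pairs:
            losses.append(train_step(pair.first_document, pair.second_document))
    return losses


def dummy_step(source: List[str], expected: List[str]) -> float:
    """Stand-in step: reports the sentence-count mismatch and updates nothing."""
    return float(abs(len(source) - len(expected)))


pairs = [
    TrainingDocumentPair(["src a1", "src a2"], ["tgt a1", "tgt a2"]),
    TrainingDocumentPair(["src b1", "src b2", "src b3"], ["tgt b1", "tgt b2"]),
]
losses = train(pairs, dummy_step, epochs=2)
```

Note that the second pair has a different number of sentences on each side; whole documents are paired, so a one-to-one sentence alignment is not required by this structure.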

The following takes as an example an initialized document translation model that is a seq2seq (sequence-to-sequence) model including at least two encoder layers and at least one decoder layer. In this case, the training in which a first document of a training document pair in the set of training document pairs is used as the input of the initialized document translation model and the second document of the training document pair is used as its expected output may adopt the method shown in fig. 5. As shown in fig. 5, step S430 in fig. 4 may further include:

in step S510, the input document vector is input into at least one encoder layer for processing, so as to form hidden layer sentence vectors.

In step S520, the document vector and/or the hidden layer sentence vector are input into at least one encoder layer for processing, so as to form a hidden layer vocabulary vector.

In step S530, the hidden layer vocabulary vector and the hidden layer sentence vector are input into at least one decoder layer for processing, so as to generate an output document vector.

In step S540, the initialized document translation model is parameter-adjusted according to the difference information between the output document vector and the expected output of the input document vector, so as to train the document translation model.
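The data flow of steps S510 to S540 can be illustrated with a toy forward pass and a single parameter update. Everything concrete below is an assumption made for illustration and is not stated in the disclosure: mean pooling as the first encoder, vector addition as the second encoder, one scalar `weight` as the only trainable parameter, a sum of squared errors as the "difference information", and plain gradient descent as the parameter adjustment. A real seq2seq model would use learned encoder and decoder layers instead.

```python
from typing import List

Vector = List[float]
Sentence = List[Vector]   # one vector per word
Document = List[Sentence]


def mean(vectors: List[Vector]) -> Vector:
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def encode_sentences(document: Document) -> List[Vector]:
    """S510: one hidden layer sentence vector per sentence (mean pooling)."""
    return [mean(sentence) for sentence in document]


def encode_words(document: Document, sentence_vectors: List[Vector]) -> Document:
    """S520: each word vector is combined with its sentence vector."""
    return [[add(word, s_vec) for word in sentence]
            for sentence, s_vec in zip(document, sentence_vectors)]


def decode(word_vectors: Document, sentence_vectors: List[Vector], weight: float) -> Document:
    """S530: every output vector sees its word and the whole-document context."""
    context = mean(sentence_vectors)
    return [[[weight * v for v in add(word, context)] for word in sentence]
            for sentence in word_vectors]


def forward(document: Document, weight: float) -> Document:
    sentence_vectors = encode_sentences(document)
    return decode(encode_words(document, sentence_vectors), sentence_vectors, weight)


def loss(output: Document, expected: Document) -> float:
    """Difference information of S540, here a sum of squared errors."""
    return sum((o - e) ** 2
               for out_s, exp_s in zip(output, expected)
               for out_w, exp_w in zip(out_s, exp_s)
               for o, e in zip(out_w, exp_w))


def adjust(document: Document, expected: Document, weight: float, lr: float = 0.01) -> float:
    """S540: one gradient-descent step on the single parameter `weight`."""
    unit = forward(document, 1.0)  # the output is linear in `weight`
    grad = sum(2 * (weight * u - e) * u
               for u_s, e_s in zip(unit, expected)
               for u_w, e_w in zip(u_s, e_s)
               for u, e in zip(u_w, e_w))
    return weight - lr * grad


source = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
expected = forward(source, 0.5)  # toy target produced with weight 0.5
weight = 1.0
before = loss(forward(source, weight), expected)
for _ in range(50):
    weight = adjust(source, expected, weight)
after = loss(forward(source, weight), expected)
```

Because the decoder receives both the word-level vectors and a context built from all sentence vectors, every output position depends on the whole document, which is the property the steps above rely on.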

In the embodiment, the source document of the source language to be translated is acquired, and the source document is input into the pre-trained document translation model to acquire the target document of the target language output by the document translation model, and the document translation model can directly translate the source document into the target document instead of translating the sentences of the source document one by one, so that the machine learning model can consider the semantics of the vocabulary in the whole text and translate more accurately.

As an implementation of the methods shown in the above figures, the present application provides an embodiment of a document translation apparatus, and fig. 6 shows a schematic structural diagram of a document translation apparatus provided in this embodiment, where the apparatus embodiment corresponds to the method embodiments shown in fig. 1 to fig. 5, and the apparatus may be applied to various electronic devices. As shown in fig. 6, the document translation apparatus according to the present embodiment includes a source document acquisition unit 610 and a target document acquisition unit 620.

The source document obtaining unit 610 is configured to obtain a source document of a source language to be translated, the source document including a plurality of sentences.

The target document obtaining unit 620 is configured to input the source document into a pre-trained document translation model to obtain a target document in a target language output by the document translation model, where the document translation model is capable of directly translating the source document into the target document instead of translating the plurality of sentences of the source document sentence by sentence.

Further, fig. 7 provides a schematic structural diagram of an exemplary training module of the document translation model, and as shown in fig. 7, the modules for training the document translation model include a sample obtaining module 710, a model determining module 720, and a model training module 730.

The sample acquisition module 710 is configured to acquire a set of training document pairs, wherein a training document pair comprises a first document in the source language and a second document in the target language.

The model determination module 720 is configured for determining an initialized document translation model, wherein the initialized document translation model comprises a target layer for outputting a translation result document.

The model training module 730 is configured to use a machine learning method to train the document translation model, taking the first document of a training document pair in the set of training document pairs as the input of the initialized document translation model and the second document of the training document pair as the expected output of the initialized document translation model.

In an embodiment, the initialized document translation model is a seq2seq model.

In an embodiment, the initialized document translation model includes at least two encoder layers and at least one decoder layer, and in this structure, the model training module 730 may further include a first encoding sub-module 731, a second encoding sub-module 732, a decoding sub-module 733, and a model adjusting sub-module 734.

The first encoding sub-module 731 is configured to input the input document vector into at least one encoder layer for processing, so as to form hidden layer sentence vectors.

The second encoding submodule 732 is configured to input the document vector and/or the hidden-layer sentence vector into at least one encoder layer for processing to form a hidden-layer vocabulary vector.

The decoding sub-module 733 is configured to input the hidden-layer vocabulary vectors and the hidden-layer sentence vectors to at least one decoder layer for processing to generate output document vectors.

The model adjusting sub-module 734 is configured to perform parameter adjustment on the initialized document translation model according to the difference information between the output document vector and the expected output of the input document vector, so as to train the document translation model.

The document translation apparatus provided by this embodiment can execute the document translation method provided by the method embodiments of the present disclosure, and has the functional modules and beneficial effects corresponding to the executed method.

Referring now to FIG. 8, a schematic structural diagram of an electronic device 800 suitable for implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 8 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.

As shown in fig. 8, an electronic device 800 may include a processing means (e.g., central processing unit, graphics processor, etc.) 801 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage means 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the electronic device 800 are also stored. The processing means 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

Generally, the following devices may be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 807 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage 808 including, for example, magnetic tape, hard disk, etc.; and a communication device 809. The communication means 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While fig. 8 illustrates an electronic device 800 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication means 809, or installed from the storage means 808, or installed from the ROM 802. The computer program, when executed by the processing apparatus 801, performs the above-described functions defined in the methods of the embodiments of the present disclosure.

It should be noted that the computer readable medium described above in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the disclosed embodiments, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the disclosed embodiments, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.

The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:

obtaining a source document of a source language to be translated, wherein the source document comprises a plurality of sentences;

inputting the source document into a pre-trained document translation model to obtain a target document of a target language output by the document translation model, wherein the document translation model can directly translate the source document into the target document instead of translating the sentences of the source document one by one.

Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software or hardware. The name of a unit does not, in some cases, constitute a limitation of the unit itself; for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".

According to one or more embodiments of the present disclosure, in the document translation method, the document translation model is trained by the following steps: acquiring a training document pair set, wherein the training document pair comprises a first document in the source language and a second document in the target language; determining an initialized document translation model, wherein the initialized document translation model comprises a target layer for outputting a translation result document; and taking a first document of a training document pair in the training document pair set as the input of the initialized document translation model, taking a second document of the training document pair as the expected output of the initialized document translation model, and training to obtain the document translation model.

According to one or more embodiments of the present disclosure, in the document translation method, the initialized document translation model is a seq2seq model.

According to one or more embodiments of the present disclosure, in the document translation method: the initialized document translation model comprises at least two encoder layers and at least one decoder layer; taking a first document of a training document pair in the training document pair set as the input of the initialized document translation model, taking a second document of the training document pair as the expected output of the initialized document translation model, and training to obtain the document translation model comprises: inputting the input document vector into at least one encoder layer for processing to form hidden layer sentence vectors; inputting the document vector and/or the hidden layer sentence vector into at least one encoder layer for processing to form a hidden layer vocabulary vector; inputting the hidden layer vocabulary vector and the hidden layer sentence vector into at least one decoder layer for processing so as to generate an output document vector; and carrying out parameter adjustment on the initialized document translation model according to the difference information between the output document vector and the expected output of the input document vector so as to train the document translation model.

According to one or more embodiments of the present disclosure, in the document translation apparatus, the document translation model is trained by the following modules: a sample acquisition module, configured to acquire a training document pair set, wherein each training document pair comprises a first document in the source language and a second document in the target language; a model determination module, configured to determine an initialized document translation model, wherein the initialized document translation model comprises a target layer for outputting a translation result document; and a model training module, configured to train, using a machine learning method, to obtain the document translation model by taking a first document in a training document pair in the training document pair set as the input of the initialized document translation model and taking a second document in the training document pair as the expected output of the initialized document translation model.

According to one or more embodiments of the present disclosure, in the document translation apparatus, the initialized document translation model is a seq2seq model.

According to one or more embodiments of the present disclosure, in the document translation apparatus, the initialized document translation model includes at least two encoder layers and at least one decoder layer; the model training module further comprises: a first encoding submodule, configured to input the input document vectors respectively into at least one encoder layer for processing so as to form hidden layer sentence vectors; a second encoding submodule, configured to input the document vectors and/or the hidden layer sentence vectors into at least one encoder layer for processing so as to form hidden layer vocabulary vectors; a decoding submodule, configured to input the hidden layer vocabulary vectors and the hidden layer sentence vectors into at least one decoder layer for processing so as to generate output document vectors; and a model adjusting submodule, configured to adjust the parameters of the initialized document translation model according to the difference information between the output document vectors and the expected output corresponding to the input document vectors, so as to train and obtain the document translation model.

The foregoing description covers only preferred embodiments of the present disclosure and illustrates the technical principles employed. Those skilled in the art will appreciate that the scope of the present disclosure is not limited to technical solutions formed by the particular combination of the above features, but also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the present disclosure. One example is a technical solution formed by replacing the above features with (but not limited to) features having similar functions disclosed in the embodiments of the present disclosure.

Claims

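1. A method of document translation, comprising:

obtaining a source document of a source language to be translated, wherein the source document comprises a plurality of sentences;

and inputting the source document into a pre-trained document translation model to obtain a target document of a target language output by the document translation model, wherein the document translation model can directly translate the source document into the target document instead of translating the sentences of the source document one by one.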

2. The method of claim 1, wherein the document translation model is trained by:

acquiring a training document pair set, wherein each training document pair comprises a first document in the source language and a second document in the target language;

determining an initialized document translation model, wherein the initialized document translation model comprises a target layer for outputting a translation result document;

and taking a first document of a training document pair in the training document pair set as the input of the initialized document translation model, taking a second document of the training document pair as the expected output of the initialized document translation model, and training to obtain the document translation model.

3. The method of claim 2, wherein the initialized document translation model is a seq2seq model.

4. The method of claim 3, wherein:

the initialized document translation model comprises at least two encoder layers and at least one decoder layer;

taking a first document in a training document pair in the training document pair set as an input of the initialized document translation model, taking a second document in the training document pair as an expected output of the initialized document translation model, and training to obtain the document translation model comprises:

inputting the input document vectors respectively into at least one encoder layer for processing to form hidden layer sentence vectors;

inputting the document vectors and/or the hidden layer sentence vectors into at least one encoder layer for processing to form hidden layer vocabulary vectors;

inputting the hidden layer vocabulary vectors and the hidden layer sentence vectors into at least one decoder layer for processing so as to generate output document vectors;

and adjusting the parameters of the initialized document translation model according to the difference information between the output document vectors and the expected output corresponding to the input document vectors, so as to train and obtain the document translation model.

5. A document translation apparatus, comprising:

a source document acquisition unit, configured to acquire a source document of a source language to be translated, wherein the source document comprises a plurality of sentences;

and a target document acquisition unit, configured to input the source document into a pre-trained document translation model so as to acquire a target document of a target language output by the document translation model, wherein the document translation model can directly translate the source document into the target document instead of translating the sentences of the source document one by one.

6. The apparatus of claim 5, wherein the document translation model is trained by the following modules:

a sample acquisition module, configured to acquire a training document pair set, wherein each training document pair comprises a first document in the source language and a second document in the target language;

a model determination module, configured to determine an initialized document translation model, wherein the initialized document translation model comprises a target layer for outputting a translation result document;

and a model training module, configured to train, using a machine learning method, to obtain the document translation model by taking a first document in a training document pair in the training document pair set as the input of the initialized document translation model and taking a second document in the training document pair as the expected output of the initialized document translation model.

7. The apparatus of claim 6, wherein the initialized document translation model is a seq2seq model.

8. The apparatus of claim 7, wherein: the initialized document translation model comprises at least two encoder layers and at least one decoder layer;

the model training module further comprises:

a first encoding submodule, configured to input the input document vectors respectively into at least one encoder layer for processing so as to form hidden layer sentence vectors;

a second encoding submodule, configured to input the document vectors and/or the hidden layer sentence vectors into at least one encoder layer for processing so as to form hidden layer vocabulary vectors;

a decoding submodule, configured to input the hidden layer vocabulary vectors and the hidden layer sentence vectors into at least one decoder layer for processing so as to generate output document vectors;

and a model adjusting submodule, configured to adjust the parameters of the initialized document translation model according to the difference information between the output document vectors and the expected output corresponding to the input document vectors, so as to train and obtain the document translation model.

9. An electronic device, comprising:

one or more processors; and

a memory storing executable instructions that, when executed by the one or more processors, cause the electronic device to perform the method of any one of claims 1-4.

10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-4.