CN113569585A - Translation method and device, storage medium and electronic equipment - Google Patents

Info

Publication number
CN113569585A
CN113569585A
Authority
CN
China
Prior art keywords
vector
sample
statement
target
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110212812.2A
Other languages
Chinese (zh)
Inventor
梁云龙
孟凡东
徐金安
陈钰枫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110212812.2A priority Critical patent/CN113569585A/en
Publication of CN113569585A publication Critical patent/CN113569585A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/126Character encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an artificial-intelligence-based translation method and device, a storage medium and electronic equipment. The method comprises the following steps: during a conversation between a first object and a second object, acquiring an original sentence in a first language, to be translated, currently input by the first object; acquiring historical sentences associated with the original sentence; encoding the original sentence and the historical sentences to obtain a first target vector, a second target vector and a third target vector, wherein the first target vector indicates the role preference of the first object, the second target vector indicates the order of the historical sentences, and the third target vector indicates the content association between the historical sentences; splicing the first target vector, the second target vector and the third target vector into a combined vector; and decoding the combined vector to obtain a translation result in the second language for the original sentence. The invention solves the technical problems that translation results lack role characteristics and conversation continuity and that translation accuracy is low.

Description

Translation method and device, storage medium and electronic equipment
Technical Field
The invention relates to the field of computers, in particular to a translation method and device, a storage medium and electronic equipment.
Background
In the prior art, when a sentence is translated, for example when a sentence in a first language is translated into a sentence in a second language, the role preference of the speaker, the continuity between sentences and the content correlation between sentences are generally not taken into account, so the accuracy of the translated sentence in the second language is low.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a translation method and device, a storage medium and electronic equipment, so as at least to solve the technical problems that translation results lack role characteristics and conversation continuity and have low accuracy.
According to an aspect of an embodiment of the present invention, there is provided a translation method including: in the conversation process of a first object and a second object, acquiring an original sentence of a first language to be translated, which is currently input by the first object; acquiring a history sentence associated with the original sentence, wherein the history sentence comprises a sentence generated by the first object and a sentence generated by the second object in a history conversation process between the first object and the second object; encoding the original sentence and the historical sentence to obtain a first target vector, a second target vector and a third target vector, wherein the first target vector is used for indicating the role preference of the first object, the second target vector is used for indicating the sequence of the historical sentences, and the third target vector is used for indicating the content association among the historical sentences; splicing the first target vector, the second target vector and the third target vector into a combined vector; and decoding the combined vector to obtain a translation result of the second language translated by the original sentence.
According to another aspect of the embodiments of the present invention, there is also provided a translation apparatus, including: a first obtaining unit, configured to obtain, in a session between a first object and a second object, an original sentence, in a first language to be translated, currently input by the first object; a second obtaining unit configured to obtain a history sentence related to the original sentence, wherein the history sentence includes a sentence generated by the first object and a sentence generated by the second object in a history conversation process between the first object and the second object; encoding means for encoding the original sentence and the historical sentence to obtain a first target vector indicating a role preference of the first object, a second target vector indicating an order of the historical sentences, and a third target vector indicating a content relation between the historical sentences; a splicing unit configured to splice the first target vector, the second target vector, and the third target vector into a combined vector; and the decoding unit is used for decoding the combined vector to obtain a translation result of the second language after the original sentence is translated.
As an alternative example, the splicing unit includes: a concatenation module configured to concatenate the first target vector, the second target vector, the third target vector and a coded vector into the combined vector when the original sentence includes the coded vector, where, when any word following the first word of the original sentence is translated, the coded vector is the word-embedding vector of each word preceding that word in the original sentence.
According to yet another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to be executed by a processor to perform the translation method described above.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the translation method through the computer program.
In the embodiment of the invention, the original sentence in the first language to be translated, currently input by the first object, is obtained during the conversation between the first object and the second object; the historical sentences associated with the original sentence are obtained; the original sentence and the historical sentences are encoded to obtain a first target vector, a second target vector and a third target vector, wherein the first target vector indicates the role preference of the first object, the second target vector indicates the order of the historical sentences, and the third target vector indicates the content association between the historical sentences; the first target vector, the second target vector and the third target vector are spliced into a combined vector; and the combined vector is decoded to obtain the translation result in the second language for the original sentence. Because the role preference, the sentence order and the content association of the original sentence are captured by the three target vectors before decoding, the accuracy of translation is improved, which solves the problems that translation results lack role characteristics and conversation continuity and have low accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic diagram of an application environment for an alternative translation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an application environment of an alternative translation method according to an embodiment of the present invention;
FIG. 3 is a schematic illustration of the flow of an alternative translation method according to an embodiment of the invention;
FIG. 4 is a diagram illustrating a translation scenario of an alternative translation method according to an embodiment of the present invention;
FIG. 5 is a diagram of a translation scenario of an alternative translation method according to an embodiment of the present invention;
FIG. 6 is a diagram of historical statements of an alternative translation method according to an embodiment of the invention;
FIG. 7 is a schematic diagram of a target neural network model for an alternative translation method according to embodiments of the present invention;
FIG. 8 is a schematic diagram of the data processing layers of an alternative translation method according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an alternative translation apparatus according to an embodiment of the present invention;
fig. 10 is a schematic structural diagram of an alternative electronic device according to an embodiment of the invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science integrating linguistics, computer science and mathematics; research in this field therefore involves natural language, that is, the language people use every day, and is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question-answering robots, knowledge graphs, and the like.
According to an aspect of the embodiments of the present invention, there is provided a translation method, which may be applied, but not limited to, in the environment shown in fig. 1 as an optional implementation manner.
As shown in fig. 1, the terminal device 102 includes a memory 104 for storing various data generated during the operation of the terminal device 102, a processor 106 for processing that data, and a display 108 for displaying the original sentence and the translation result. The terminal device 102 may interact with the server 112 via the network 110. The server 112 includes a database 114 for storing various data items and a processing engine 116 for processing them. In steps S102 to S106, the terminal device 102 sends the original sentence to the server 112, and the server 112 translates the original sentence to obtain a translation result and returns the translation result to the terminal device 102.
As an alternative embodiment, the translation method described above may be applied, but not limited to, in the environment shown in FIG. 2.
As shown in fig. 2, the terminal device 202 includes a memory 204 for storing various data generated during the operation of the terminal device 202, a processor 206 for processing and operating the various data, and a display 208 for displaying the original sentence and the translation result. The terminal device 202 may execute step S202, that is, the terminal device 202 completes the translation of the original sentence to obtain the translation result.
Optionally, in this embodiment, the terminal device may be a terminal device configured with a target client, and may include, but is not limited to, at least one of the following: mobile phones (such as Android phones, iOS phones, etc.), notebook computers, tablet computers, palm computers, MID (Mobile Internet Devices), PAD, desktop computers, smart televisions, etc. The target client may be a video client, an instant messaging client, a browser client, an educational client, etc. The network may include, but is not limited to: a wired network or a wireless network, where the wired network includes a local area network, a metropolitan area network and a wide area network, and the wireless network includes Bluetooth, WIFI and other networks enabling wireless communication. The server may be a single server, a server cluster composed of a plurality of servers, or a cloud server. The above is merely an example, and this embodiment is not limited thereto.
Optionally, as an optional implementation manner, as shown in fig. 3, the translation method includes:
s302, in the conversation process of the first object and the second object, acquiring an original sentence, to be translated, of the first language, currently input by the first object;
s304, obtaining historical sentences associated with the original sentences, wherein the historical sentences comprise sentences generated by the first object and sentences generated by the second object in the historical conversation process between the first object and the second object;
s306, encoding the original statement and the historical statements to obtain a first target vector, a second target vector and a third target vector, wherein the first target vector is used for expressing role preference of a first object, the second target vector is used for expressing the sequence of the historical statements, and the third target vector is used for expressing content association among the historical statements;
s308, splicing the first target vector, the second target vector and the third target vector into a combined vector;
s310, decoding the combined vector to obtain a translation result of the second language translated by the original sentence.
Alternatively, the above translation method can be applied to, but is not limited to, the process of translating a sentence. The language of the sentence is not limited; for example, it may be a language used in any country or region. The method may be used, but is not limited to use, during a conversation between two or more parties. The parties may each use a different language; alternatively, at least two parties may use the same language while the other parties use different languages.
Optionally, the translation method may be applied to a client, and in a process that different users use the client to perform communication or instant messaging, the method in this embodiment may be used to perform translation. Alternatively, the client may be used to translate the conversation content during the face-to-face communication between different users. The client may be a client that receives and translates two or more communication contents, or two users may use one client to translate the communication contents.
For example, taking the case in which two users use one client for translation, as shown in fig. 4, the client receives each user's original sentence, translates it to obtain a translation result, and presents the translation result. When one user's original sentence is being translated, that user is the first object and the other user is the second object. The client receives user A's statement in the first language and translates it into the translated second-language statement 402; it receives user B's statement in the second language and translates it into the translated first-language statement 404. User A and user B may thus communicate through the client.
Alternatively, a plurality of users may use their respective clients to perform translation. As shown in fig. 5, three users interact with each other, each using one client, and the three users may use different languages. For user A, after a first-language sentence is sent to user B and user C, the clients of user B and user C translate it into the corresponding second-language and third-language sentences. Likewise, the client of user A translates user B's second-language statements and user C's third-language statements into first-language statements.
Optionally, in this embodiment of the present application, the identities of the first object and the second object are not limited. For example, the objects may be a tour guide and a tourist, a shop owner and a customer, and so on. The translation method of the embodiment of the present application can be used as long as the two parties use sentences in two languages. Likewise, the manner of obtaining the original sentence is not limited: for example, an original sentence input as text may be acquired, or an original sentence input as voice may be acquired.
The historical statements in the embodiment of the present application may be statements generated during the conversation between the first object and the second object before the original statement. They may include first historical statements in the first language generated by the first object, second historical statements in the second language generated by the second object, first translation statements of the first historical statements, and second translation statements of the second historical statements, where the first translation statements are statements in the second language and the second translation statements are statements in the first language.
Through this embodiment, in the process of translating the original sentence, the original sentence and the historical sentences are encoded to obtain the first target vector, the second target vector and the third target vector, where the first target vector represents the role preference of the first object, the second target vector represents the order of the historical sentences, and the third target vector represents the content association between the historical sentences. The role preference, sentence order and content association associated with the original sentence are thus captured, and the combined vector of the three target vectors is then decoded to obtain the translation result, achieving the effect of improving translation accuracy.
As an optional implementation, encoding the original sentence and the historical sentence to obtain a first target vector, a second target vector, and a third target vector includes:
acquiring a first statement, a second statement, a third statement and a fourth statement, wherein the first statement comprises the first historical statements, the second statement comprises the first historical statements and the second translation statements, the third statement comprises all of the historical statements, and the fourth statement is the original statement;
acquiring a first vector, a second vector, a third vector and a fourth vector, wherein the first vector is a low-dimensional vector mapped from the first statement, the second vector is a low-dimensional vector mapped from the second statement, the third vector is a low-dimensional vector mapped from the third statement, and the fourth vector is a low-dimensional vector mapped from the fourth statement;
the first vector is encoded as a first target vector, the second vector and the fourth vector are encoded as a second target vector, and the third vector is encoded as a third target vector.
Optionally, the first statement, the second statement, and the third statement mentioned in this embodiment may be statements obtained from the historical statements, and the fourth statement may be the original statement. Specifically, among the historical statements, the first historical statements produced by the first role are taken as the first statement; the first historical statements together with the second translation statements are taken as the second statement; all contents of the historical statements are taken as the third statement; and the original statement is taken as the fourth statement. An embedding layer may be used to map the first to fourth statements to low-dimensional vectors, yielding the first to fourth vectors. The first vector is then encoded into the first target vector, the second vector and the fourth vector are encoded into the second target vector, and the third vector is encoded into the third target vector.
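The embedding step can be sketched as follows. The toy vocabulary `EMB` and the choice of averaging word embeddings into a fixed-size sentence vector are illustrative assumptions (the patent does not specify the pooling):

```python
# Hypothetical embedding layer: each word maps to a low-dimensional
# vector; a sentence vector is the average of its word vectors.
EMB = {
    "hello": [1.0, 0.0],
    "how":   [0.0, 1.0],
    "are":   [1.0, 1.0],
    "you":   [0.5, 0.5],
}
DIM = 2

def embed_sentence(words):
    # Out-of-vocabulary words map to the zero vector.
    vecs = [EMB.get(w, [0.0] * DIM) for w in words]
    n = max(len(vecs), 1)
    return [sum(v[i] for v in vecs) / n for i in range(DIM)]

# The fourth vector: the low-dimensional mapping of the original statement.
fourth_vector = embed_sentence("how are you".split())
```

The first to third vectors would be produced the same way from the corresponding statement groups.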
By the embodiment, the first target vector, the second target vector and the third target vector can be obtained from the original sentence of the first object and the historical sentence of the conversation between the first object and the second object, so that the role preference of the first object, the sequence of the sentences and the content association between the sentences can be determined.
As an alternative embodiment, encoding the first vector as a first target vector, encoding the second vector and the fourth vector as a second target vector, and encoding the third vector as a third target vector comprises:
encoding the first vector into a first target vector and the third vector into a third target vector using a first encoding layer;
and encoding the second vector and the fourth vector into a first intermediate vector and a second intermediate vector by using a first encoding layer, encoding the second intermediate vector into a result vector by using the second encoding layer, and combining the first intermediate vector and the result vector into a second target vector, wherein the first encoding layer comprises one encoding layer, and the second encoding layer comprises five encoding layers.
Alternatively, the first coding layer and the second coding layer may be, but are not limited to, coding layers of the target neural network. The two differ in depth and in the content they encode. The first coding layer contains one coding layer: it encodes the first vector and the third vector to obtain the first target vector and the third target vector, and it also encodes the second vector and the fourth vector to obtain the first intermediate vector and the second intermediate vector. The second coding layer, which contains five coding layers, then encodes the second intermediate vector into a result vector, and the first intermediate vector and the result vector are combined into the second target vector.
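The routing of the four vectors through the one-layer and five-layer coding stacks can be sketched as below. Each "layer" here is a stand-in elementwise transform, an assumption made purely to show the data flow, not a real encoder layer:

```python
# Sketch of the two coding layers: a one-layer stack and a five-layer
# stack, where each layer is a toy elementwise transform (+1).
def make_stack(n_layers, f):
    def apply(v):
        for _ in range(n_layers):
            v = [f(x) for x in v]
        return v
    return apply

first_coding_layer = make_stack(1, lambda x: x + 1)   # one coding layer
second_coding_layer = make_stack(5, lambda x: x + 1)  # five coding layers

v1, v2, v3, v4 = [0.0], [1.0], [2.0], [3.0]           # toy input vectors

target1 = first_coding_layer(v1)            # first target vector
target3 = first_coding_layer(v3)            # third target vector
inter1 = first_coding_layer(v2)             # first intermediate vector
inter2 = first_coding_layer(v4)             # second intermediate vector
result_vec = second_coding_layer(inter2)    # result vector
target2 = inter1 + result_vec               # combined: second target vector
```

The point of the sketch is the wiring: the first and third vectors pass only through the shallow stack, while the fourth vector additionally passes through the deep stack before being combined with the encoded second vector.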
In this embodiment, the first to fourth vectors are encoded by using different encoding layers, so that the more accurate first target vector, second target vector, and third target vector can be obtained.
As an optional implementation, the stitching the first target vector, the second target vector, and the third target vector into a combined vector includes:
and in the case that the original sentence includes a coding vector, splicing the first target vector, the second target vector, the third target vector and the coding vector into a combined vector, wherein, in the case that any word after the first word in the original sentence is translated, the coding vector is the word-embedding vector of each word before that word in the original sentence.
Alternatively, when the original sentence is translated, it may be translated sentence by sentence or word by word. For word-by-word translation, when any given word of the original sentence is being translated, the word-embedding vector of each word preceding that word is used as the coding vector, and the coding vector is spliced with the first target vector, the second target vector and the third target vector to obtain the combined vector. Finally, the combined vector is decoded to obtain the translation result of the original sentence.
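A minimal sketch of this word-by-word scheme follows. The one-dimensional `word_embed` and the `decode_step` callback are hypothetical placeholders; only the splicing of the prefix's embeddings onto the three target vectors reflects the mechanism described above:

```python
# Sketch of word-by-word translation: for each word, the embeddings of
# all preceding words form the coding vector, which is spliced onto the
# three target vectors before decoding that word.
def word_embed(word):
    return [float(len(word))]  # toy 1-dimensional word embedding

def translate_word_by_word(source_words, target_vectors, decode_step):
    output = []
    for i, word in enumerate(source_words):
        # Coding vector: embeddings of every word before the current one.
        coding_vector = [x for w in source_words[:i] for x in word_embed(w)]
        combined = target_vectors + coding_vector  # splicing
        output.append(decode_step(word, combined))
    return output

targets = [0.1, 0.2, 0.3]  # stand-in for the three spliced target vectors
out = translate_word_by_word(["how", "are", "you"], targets,
                             decode_step=lambda w, v: (w.upper(), len(v)))
```

Note how the combined vector grows with each translated word, so every earlier word influences the word currently being translated.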
In this embodiment, when any word of the original sentence is translated, the coding vector is added, so that the words preceding the currently translated word influence its translation, thereby improving translation accuracy. This also avoids the situation in which the same word of the first language is translated into different words of the second language.
As an optional implementation, encoding the original sentence and the historical sentence to obtain a first target vector, a second target vector, and a third target vector includes: inputting the historical statement into a target neural network model, and extracting a first target vector, a second target vector and a third target vector by a feature extraction layer of the target neural network model; decoding the combined vector to obtain a translation result of the second language after the translation of the original sentence, including: and determining a result obtained by decoding the combined vector by using a decoding layer of the target neural network model as a translation result, wherein the target neural network model is a pre-trained model, and the feature extraction layer and the decoding layer comprise target parameters obtained after the original parameters are trained.
Optionally, the target neural network model in this embodiment may include a feature extraction layer and a decoding layer. The feature extraction layer extracts features from the first to fourth vectors to obtain the first target vector, the second target vector and the third target vector; the decoding layer decodes the combined vector to obtain the translation result of the original statement. Both layers contain model parameters. During training, these parameters are adjusted as training progresses. During use, the model parameters are generally fixed; according to these parameters, after the first to fourth vectors are input, the first to third target vectors are obtained by encoding, and the combined vector is decoded into the translation result. The target neural network model is a mature model obtained by training the original neural network model with sample data. Because it has learned the role preference of the first object, the order of the sentences and the content association between the sentences, its translation accuracy is high.
As an optional implementation, before inputting the historical statement into the target neural network model, and extracting the first target vector, the second target vector, and the third target vector by the feature extraction layer of the target neural network model, the method further includes:
acquiring an original neural network model, wherein the original neural network model comprises a feature extraction layer and a decoding layer, and the feature extraction layer and the decoding layer comprise original parameters;
obtaining a sample original sentence, a sample translation result and a sample historical sentence of the sample original sentence, wherein the sample original sentence is a sentence of a first language generated by a third object, the sample translation result is a sentence of a second language obtained by translating the sample original sentence, the sample historical sentence comprises a first sample historical sentence of the first language generated by the third object in a dialogue process between the third object and a fourth object, a second sample historical sentence of the second language generated by the fourth object, a first sample translation sentence of the first sample historical sentence and a second sample translation sentence of the second sample historical sentence, the first sample translation sentence is a sentence of the second language, and the second sample translation sentence is a sentence of the first language;
and training original parameters of the original neural network model by using the sample original sentences, the sample translation results and the sample historical sentences to obtain the target neural network model.
Optionally, the sample data in the embodiment of the present application may include a plurality of pieces. Each piece of sample data comprises a sample original statement, a sample translation result and sample historical statements. The sample original sentence is a sentence in the first language, and the sample translation result is a sentence in the second language. The sample historical statements include both statements in the first language and statements in the second language. That is, the sample data includes statements generated by a conversation between two objects, together with the translation results of those statements. However, the two objects need not be the first object and the second object described above. That is, a record of any two objects conversing in the first language and the second language may be used as a training sample.
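As a sketch of how one piece of such sample data might be organized (the field names and helper are illustrative assumptions, not part of the patent):

```python
# A minimal sketch (field names are illustrative, not from the patent) of one
# piece of sample data: an original sentence in the first language, its known
# translation in the second language, and the bilingual dialogue history.
def make_sample(original, translation, history):
    """history: list of (speaker, source_text, translated_text) turns."""
    return {
        "sample_original_sentence": original,      # first language, to be translated
        "sample_translation_result": translation,  # second language, reference
        "sample_history": history,                 # both languages, both speakers
    }

sample = make_sample(
    "How are you?",                # English (first language)
    "Wie geht es dir?",            # German (second language)
    [("A", "Hello.", "Hallo."),    # object A speaks English, translated to German
     ("B", "Hallo!", "Hello!")],   # object B speaks German, translated to English
)
```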
As an alternative embodiment, training the original parameters of the original neural network model using the sample original sentences, the sample translation results and the sample historical sentences, and obtaining the target neural network model includes:
performing the following operations on the original neural network model until the recognition accuracy of the original neural network model is greater than a first threshold value to obtain a target neural network model:
inputting the sample translation result into a feature extraction layer to obtain a first target sample vector, a second target sample vector and a third target sample vector;
inputting the original sample statement into a feature extraction layer to obtain a first original sample vector, a second original sample vector and a third original sample vector;
combining the first original sample vector, the second original sample vector, the third original sample vector and a sample coding vector of a sample original sentence into a sample combination vector, wherein the sample coding vector is a word embedding vector of each word before any word in the sample original sentence under the condition that any word after the first word of the sample original sentence is translated;
decoding the sample combination vector by using the decoding layer to obtain a predicted statement;
and adjusting the value of the original parameter under the condition that the difference between the first target sample vector and the first original sample vector is larger than a first threshold value, or the difference between the second target sample vector and the second original sample vector is larger than a second threshold value, or the difference between the third target sample vector and the third original sample vector is larger than a third threshold value, or the difference between the sample translation result and the predicted statement is larger than a fourth threshold value.
That is, in the process of training the model, the original neural network model receives the first through fourth sample vectors obtained by mapping the first through fourth sample statements, and encodes them using the feature extraction layer to obtain the first through third original sample vectors. The sample translation result of the sample original statement in the sample data is also encoded, to obtain the first, second and third target sample vectors. The sample combination vector is decoded into a predicted statement using the decoding layer, and the model parameters of the feature extraction layer and the decoding layer in the original neural network model are constrained through the target sample vectors, the original sample vectors and the predicted statement, which provides the training signal. Finally, a fully trained target neural network model is obtained.
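The threshold-based adjustment described above can be sketched as follows; plain numpy vectors stand in for the model's encodings, a Euclidean norm stands in for the vector differences, and the parameter update is a placeholder (all of these are simplifying assumptions — the patent does not fix a distance measure or update rule here):

```python
import numpy as np

def train_step(params, enc_target, enc_original, thresholds, lr=0.1):
    """One training operation: compare the target-side and original-side sample
    vectors pairwise; if any difference exceeds its threshold, adjust params.
    enc_target / enc_original: lists of three vectors (Z1, Z2, Z3 analogues)."""
    diffs = [np.linalg.norm(t - o) for t, o in zip(enc_target, enc_original)]
    if any(d > th for d, th in zip(diffs, thresholds)):
        params = params - lr * params  # stand-in for a real gradient update
        adjusted = True
    else:
        adjusted = False
    return params, adjusted

params = np.ones(4)
t = [np.array([1.0, 0.0]), np.array([0.5, 0.5]), np.array([0.0, 1.0])]
o = [np.array([1.0, 0.1]), np.array([0.5, 0.5]), np.array([0.0, 1.0])]
params, adjusted = train_step(params, t, o, thresholds=[0.05, 0.05, 0.05])
```

Here the first pair of vectors differs by 0.1, which exceeds its 0.05 threshold, so the parameters are adjusted.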
A specific example is described below. In this embodiment, two objects communicating with each other are taken as an example. The two objects use their respective languages: the first object uses the first language and the second object uses the second language. The first language may be English and the second language may be German. As shown in fig. 6, the first object produces 602-1, 606-1 and 610-1 in the first language; 602-1 and 606-1 are translated into 602-2 and 606-2 in the second language, and 610-1 is the sentence to be translated into 610-2. The second object produces 604-1 and 608-1 in the second language, which are translated into 604-2 and 608-2 in the first language. If 610-2 is known, the original sentence, the historical sentences and the translation result in fig. 6 can be used as a sample to train the original neural network model. If 610-2 is unknown, the original sentence can be translated using the trained target neural network model to obtain the translation result. In the case where 610-2 is known, the data in fig. 6 are used as training sample data. The Role-Specific Contexts (c1) are the English sentences generated by object A in the training sample data, namely 602-1 and 606-1 in fig. 6. The Coherent Chat Context (c2) consists of the English sentences generated by object A and the English translations of object B's German sentences, namely 602-1, 604-2, 606-1 and 608-2 in fig. 6. The Contexts of Inter-Linguistic Relationships (c3) are the historical conversation records generated by object A and object B together with their translation records, i.e. the sample historical statements described above, namely statements 602-1, 602-2 through 608-1, 608-2 in fig. 6. The Source Inputs (x) are the sample original sentence of object A to be translated, i.e. 610-1 in fig. 6. The sample translation result is known.
If 610-2 is unknown, the Role-Specific Contexts (c1) correspond to the first statement, the Coherent Chat Context (c2) to the second statement, and the Contexts of Inter-Linguistic Relationships (c3) to the third statement. The original text generated by the first object or the second object may be represented by Src, and the translation of that original text by Tgt. The target sentence obtained by translating the sentence to be translated may be represented by Ref.
And for the original sentence to be translated, translating the original sentence to obtain a translation result by using the target neural network model. The target neural network model is a pre-trained model. The structure of the target neural network model (or the original neural network model) is shown in fig. 7. In fig. 7, training and use of the model are explained.
First, the training process needs to acquire training sample data. A training sample comprises a sample original sentence, a sample translation result and the sample historical sentences of the sample original sentence. The sample original sentence is the sentence to be translated in the training sample data, for example English, and the sample translation result is the result obtained by translating it, for example the translated German. The sample historical sentences are the history record before the sample original sentence, and include both English and German sentences. Specifically, they comprise a first sample historical sentence in English generated by the third object in the process of dialogue between the third object and a fourth object, a second sample historical sentence in German generated by the fourth object, a first sample translation sentence of the first sample historical sentence and a second sample translation sentence of the second sample historical sentence, wherein the first sample translation sentence is in German and the second sample translation sentence is in English. The third object may or may not be the first object, and the fourth object may or may not be the second object. The sample original sentence, the sample historical sentences and the sample translation result are taken as one training sample, and the original neural network model is trained until the target neural network model is obtained.
In the training process, a plurality of training samples may be obtained, and the training process of one training sample is taken as an example for explanation. A training sample comprises the sample original sentence, the sample historical sentence and the sample translation result.
First, the process contains a dialogue history information section, i.e. data acquisition. A first sample statement, a second sample statement, a third sample statement and a fourth sample statement are obtained from the sample original statement, the sample historical statements and the sample translation result. The first sample sentence is the English generated by the third object in the above historical statements. The second sample sentence is the English generated by the third object together with the English translations of the German generated by the fourth object. The third sample statement includes the English produced by the third object with its German translations and the German produced by the fourth object with its English translations, i.e. all historical sample statements. The fourth sample sentence is the sample original sentence generated by the third object, i.e. the English to be translated into German. Corresponding to fig. 7, the training process obtains the training sample data Role-Specific Contexts (c1), Coherent Chat Context (c2), Contexts of Inter-Linguistic Relationships (c3) and Source Inputs (x), i.e. the first through fourth sample statements.
Taking fig. 6 as an example, the first to third sample statements are obtained as follows:
For the Role-Specific Context (c1):
(equation image in the original; based on the surrounding description, c1 concatenates object A's English sentences with special markers, e.g. c1 = [CLS] 602-1 [SEP] 606-1 [SEP])
Here [SEP] can be regarded as a separator between sentences (like "and"), and [CLS] is a special classification token whose hidden state is used to represent the whole sequence. That is, the English sentences of the third object are acquired.
For the Coherent Chat Context (c2):
(equation image in the original; c2 concatenates the English side of the whole dialogue, e.g. c2 = [CLS] 602-1 [SEP] 604-2 [SEP] 606-1 [SEP] 608-2 [SEP])
that is, english of the third object and german translated english of the fourth object are acquired.
For the Context used to model inter-language relationships (c3):
(equation images in the original; c3 contains both language sides of the full history, e.g. the English side [CLS] 602-1 [SEP] 604-2 [SEP] 606-1 [SEP] 608-2 [SEP] and the German side [CLS] 602-2 [SEP] 604-1 [SEP] 606-2 [SEP] 608-1 [SEP])
That is, the English of the third object together with the English translations of the fourth object's German are acquired, and the German translations of the third object's English together with the German of the fourth object are acquired. In other words, all sample historical statements are taken, in both languages.
Here x represents the original text to be translated, and s0 to sn-1 (src/tgt) represent the different forms of the dialogue history.
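The assembly of the three context sequences above can be sketched as token lists (a sketch; the figure-6 sentence labels stand in for the actual token sequences, and the helper name is illustrative):

```python
def build_contexts(turns):
    """turns: list of (speaker, src_label, tgt_label) in dialogue order, where
    src_label is the first-language form and tgt_label the second-language form
    of each utterance."""
    SEP, CLS = "[SEP]", "[CLS]"
    # c1: only object A's own first-language sentences (role-specific context)
    c1 = [CLS]
    for spk, src, _ in turns:
        if spk == "A":
            c1 += [src, SEP]
    # c2: the whole dialogue in the first language (A's originals, B's translations)
    c2 = [CLS]
    for _, src, _ in turns:
        c2 += [src, SEP]
    # c3: the whole history in both languages (source side and target side)
    c3_src, c3_tgt = [CLS], [CLS]
    for _, src, tgt in turns:
        c3_src += [src, SEP]
        c3_tgt += [tgt, SEP]
    return c1, c2, (c3_src, c3_tgt)

turns = [("A", "602-1", "602-2"), ("B", "604-2", "604-1"),
         ("A", "606-1", "606-2"), ("B", "608-2", "608-1")]
c1, c2, c3 = build_contexts(turns)
```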
After learning by the Transformer, the following representations are obtained:
Source sentence representation (sum of the representations after the masked layer): Hx
Target sentence representation (sum of the representations after the masked layer): Hy
Role-Specific Context representation (i.e. [ CLS ] corresponding hidden state representation)
Dialog-level Coherent Context representation (i.e., [ CLS ] corresponding hidden state representation)
Inter-Linguistic Context representation (i.e. the [CLS] corresponding hidden state representation)
That is to say, after the first through fourth sample sentences are obtained, they pass through the embedding layers to obtain corresponding Token representations (the parts containing context information additionally pass through Turn Embeddings and Role Embeddings). Fig. 8 is a schematic diagram of an optional embedding layer, which comprises four embedding layers in total: Token Embeddings (i.e. Word Embeddings), Positional Embeddings, Turn Embeddings and Role Embeddings. At the embedding layer, the source/target inputs are processed by the token embedding and position embedding layers, while the parts containing the conversation history are additionally processed by the Turn embedding and Role embedding layers. The + in fig. 8 denotes a summation operation. That is, the first through third sample sentences pass through all four embedding layers, while the fourth sample sentence passes through only the Word Embedding and Position Embedding layers, yielding the corresponding low-dimensional vectors, i.e. the first through fourth sample vectors. The first through fourth sample vectors are then used to train the original neural network model. The original neural network model includes a feature extraction layer and a decoding layer. The first through fourth sample vectors are converted into the first through third target sample vectors by the feature extraction layer.
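The summation in fig. 8 can be sketched with numpy as follows; randomly initialized lookup tables stand in for the trained embeddings, and the sizes are illustrative. Which parts receive the extra turn and role embeddings follows the description above:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension (illustrative)
tok_emb  = rng.normal(size=(100, d))  # token (word) embeddings
pos_emb  = rng.normal(size=(50, d))   # positional embeddings
turn_emb = rng.normal(size=(10, d))   # turn embeddings
role_emb = rng.normal(size=(2, d))    # role embeddings (speaker A / speaker B)

def embed(token_ids, turn_ids=None, role_ids=None):
    """Sum the applicable embedding layers per token. Context parts pass
    turn_ids and role_ids; source/target inputs use only tokens + positions."""
    out = tok_emb[token_ids] + pos_emb[np.arange(len(token_ids))]
    if turn_ids is not None:
        out = out + turn_emb[turn_ids]
    if role_ids is not None:
        out = out + role_emb[role_ids]
    return out

ctx_vecs = embed([5, 7, 9], turn_ids=[0, 0, 1], role_ids=[0, 0, 1])  # history part
src_vecs = embed([5, 7, 9])                                          # source input
```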
For the Coherent Chat Context + Source Context (Source Inputs (x)), a first coding layer (the Transformer Bottom Block layer) and a second coding layer (the Top Block layer) are used, where the Top Block layer encodes only the Source Inputs information, via a MASK operation, as the input of the Decoder. The context information part uses the [CLS] hidden-layer state representation of the Transformer Bottom Block layer as the Context representation, which is used to sample the hidden variable Z2. That is, in the training process, the second sample vector and the fourth sample vector undergo feature extraction by the feature extraction layer of the original neural network model to obtain the second target sample vector Z2. The Context representations for modeling specific role preferences and inter-language relationships pass through the Transformer Bottom Block layer only, and their [CLS] hidden-state representations are provided for sampling the hidden variables Z1 and Z3. That is, the Role-Specific Contexts (c1) and the Contexts of Inter-Linguistic Relationships (c3) are each encoded by the first coding layer of the feature extraction layer to obtain the first target sample vector Z1 and the third target sample vector Z3 respectively (corresponding to Z1 through Z3 output by the feature extraction layer 702 in fig. 7).
The specific formulas for determining the first target sample vector Z1 through the third target sample vector Z3 are as follows:
The preference of a specific role can be effectively modeled by constructing the prior distribution and the posterior distribution of the specific role preference. The hidden variable obeys a multivariate Gaussian distribution, i.e. q1(z1 | y, x, c1) ~ N(μ1, σ1²I) and p1(z1 | x, c1) ~ N(μ1′, σ1′²I), so its posterior and prior distributions are as follows:
(equation images in the original: the mean and variance of the posterior q1 and the prior p1 are obtained from linear transformations, with parameters Wr and br, of the encoded representations — the posterior conditioned on the target, the source and c1, the prior on the source and c1 only)
where Wr and br are the parameters to be learned.
During the training phase, Z1 is determined using the posterior distribution. In the use phase, Z1 is determined using an a priori distribution.
By modeling the prior distribution and the posterior distribution of the conversation coherence, the coherence of the dialogue can be effectively improved. Similarly, the hidden variable is assumed to obey a multivariate Gaussian distribution, i.e. q2(z2 | y, x, c2) ~ N(μ2, σ2²I) and p2(z2 | x, c2) ~ N(μ2′, σ2′²I); the posterior and prior distributions are as follows:
(equation images in the original: as for z1, the mean and variance of q2 and p2 are obtained from linear transformations with parameters Wc and bc, the posterior conditioned on the target, the source and c2, the prior on the source and c2 only)
where Wc and bc are the parameters to be learned. During the training phase, Z2 is determined using the posterior distribution. In the use phase, Z2 is determined using the prior distribution.
By modeling the prior distribution and the posterior distribution of the inter-language relationships, the problem of inconsistent vocabulary translation can be effectively mitigated. Similarly, q3(z3 | y, x, c3) ~ N(μ3, σ3²I) and p3(z3 | x, c3) ~ N(μ3′, σ3′²I); the posterior and prior distributions are as follows:
(equation images in the original: likewise, the mean and variance of q3 and p3 are obtained from linear transformations with parameters Wl and bl)
where Wl and bl are the parameters to be learned, and c3 contains two context parts (source and target). During the training phase, Z3 is determined using the posterior distribution. In the use phase, Z3 is determined using the prior distribution.
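The shared pattern of the three priors and posteriors can be sketched as follows: a linear map produces the mean and log-variance from a condition vector, and a latent is drawn with the reparameterization trick. The linear-map form and the log-variance parameterization are assumptions in line with standard conditional-VAE practice; the patent shows only the Gaussian forms above:

```python
import numpy as np

rng = np.random.default_rng(1)
d_h, d_z = 6, 4  # hidden and latent sizes (illustrative)

def gaussian_params(W, b, h):
    """[mu ; log sigma^2] = W h + b, split into mean and log-variance."""
    out = W @ h + b
    return out[:d_z], out[d_z:]

def sample_z(mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

W_prior = rng.normal(size=(2 * d_z, d_h))
b_prior = np.zeros(2 * d_z)
h_prior = rng.normal(size=d_h)          # e.g. [CLS] states of x and c1
mu_p, lv_p = gaussian_params(W_prior, b_prior, h_prior)
z1 = sample_z(mu_p, lv_p)               # prior sample (use phase)
```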
That is, during the training process, the original neural network model determines Z1, Z2 and Z3 by the above prior-distribution formulas (corresponding to Z1 through Z3 output by the feature extraction layer 702 of the original neural network model in fig. 7). At the same time, the Target representations obtained through the Transformer Top Block layer and the Transformer Bottom Block layer are used for the posterior: the Target Inputs (y) (y < t) in the sample data are input into the first coding layer and the second coding layer 706 during training, and the posterior Z1, Z2 and Z3 of the Target Inputs are determined (corresponding to Z1 through Z3 obtained from the first coding layer and the second coding layer 706 in fig. 7). The distances between the two Z1, the two Z2 and the two Z3 (the KL(||) distances in fig. 7) are then compared; if any distance exceeds its preset threshold, the original parameters in the original neural network model are adjusted, i.e. the parameters W and b to be learned described above.
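The KL(||) distance in fig. 7 between a posterior N(μ, σ²I) and a prior N(μ′, σ′²I), both with diagonal covariance, has a closed form; a minimal sketch:

```python
import numpy as np

def kl_diag_gaussians(mu_q, log_var_q, mu_p, log_var_p):
    """KL( N(mu_q, diag(exp(log_var_q))) || N(mu_p, diag(exp(log_var_p))) )."""
    var_q, var_p = np.exp(log_var_q), np.exp(log_var_p)
    return 0.5 * np.sum(
        log_var_p - log_var_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

# KL of a distribution with itself is zero; it grows as the means separate.
mu = np.zeros(3)
lv = np.zeros(3)
kl_same = kl_diag_gaussians(mu, lv, mu, lv)
kl_far  = kl_diag_gaussians(mu + 1.0, lv, mu, lv)
```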
In addition, feature aggregation is performed using the Decoder representation and the three sampled hidden-variable representations Z1, Z2 and Z3 (the training phase samples Z1, Z2 and Z3 from the posterior distributions; the testing phase samples them from the prior distributions), and decoding is then performed through Softmax to generate a suitable translation. That is, the Z1 through Z3 output by the original neural network model also need to be decoded, e.g. by the decoding layer 704 in fig. 7. In the decoding process, to generate the t-th word yt, the history part y1:t-1 of the Target content is first encoded with a self-attention mechanism:
Hy=MultiHead(y,y,y)
where y is the word-embedding vector representation of y1:t-1. The interaction representation between the encoder representation Hx and the content representation Hy is then derived by another attention mechanism:
O=FFN(MultiHead(Hy,Hx,Hx))
The multiple hidden variables are then combined into a joint representation by a splicing operation:
Z=FFN(z1,z2,z3)
Finally, decoding is performed to generate the final output: p(yt | y1:t-1; x, z1, z2, z3, θ) = Softmax(Wo[Ot, Z]), where Wo is a parameter to be learned.
P is the probability of the predicted word. The word with the highest probability is taken as the translation result of the current word of the original sentence, and after the translation result of every word of the original sentence has been determined, the translation result of the original sentence is obtained. The parameter Wo to be learned is adjusted by comparing the predicted sentence with the sample translation result. The training of the original neural network model is then completed, and the target neural network model is obtained.
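The final projection-and-softmax step can be sketched as follows. For brevity the attention computations are replaced by fixed vectors, and Z = FFN(z1, z2, z3) is simplified to a plain concatenation; Wo and the vocabulary size are illustrative:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())  # shift for numerical stability
    return e / e.sum()

rng = np.random.default_rng(2)
d_o, d_z, vocab = 4, 3, 5             # illustrative sizes

O_t = rng.normal(size=d_o)            # interaction representation at step t
z1, z2, z3 = (rng.normal(size=d_z) for _ in range(3))
Z = np.concatenate([z1, z2, z3])      # splicing; FFN omitted for brevity

W_o = rng.normal(size=(vocab, d_o + 3 * d_z))
p = softmax(W_o @ np.concatenate([O_t, Z]))  # p(y_t | y_{1:t-1}; x, z1, z2, z3)
y_t = int(np.argmax(p))               # highest-probability word is the output
```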
The MultiHead in the decoding process can also be replaced by a single-head self-attention mechanism.
In the use process of the target neural network model, the original sentence to be translated and the historical sentences are obtained, and the first through fourth vectors are then obtained from them. After the first through fourth vectors are input into the target neural network model, the model extracts the first through third target vectors using the trained feature extraction layer. Then the word embedding vector of each word before the word currently being translated in the original sentence can be obtained, the first target vector, the second target vector, the third target vector and the word embedding vector are spliced into a combined vector, and finally the combined vector is decoded by the trained decoding layer of the target neural network model to obtain the translation result.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiment of the invention, a translation device for implementing the translation method is also provided. As shown in fig. 9, the apparatus includes:
a first obtaining unit 902, configured to obtain, in a session between a first object and a second object, an original sentence, in a first language to be translated, currently input by the first object;
a second obtaining unit 904, configured to obtain a history statement associated with the original statement, where the history statement includes a statement generated by the first object and a statement generated by the second object during a history session between the first object and the second object;
an encoding unit 906, configured to encode the original sentence and the historical sentences to obtain a first target vector, a second target vector, and a third target vector, where the first target vector is used to indicate role preference of the first object, the second target vector is used to indicate an order of the historical sentences, and the third target vector is used to indicate content association between the historical sentences;
a stitching unit 908 for stitching the first target vector, the second target vector and the third target vector into a combined vector;
the decoding unit 910 is configured to decode the combined vector to obtain a translation result of the second language after the original sentence is translated.
For other examples of this embodiment, please refer to the above examples, which are not described herein again.
According to another aspect of the embodiment of the present invention, there is also provided an electronic device for implementing the translation method, where the electronic device may be a terminal device or a server shown in fig. 10. The present embodiment takes the electronic device as an example for explanation. As shown in fig. 10, the electronic device comprises a memory 1002 and a processor 1004, the memory 1002 having stored therein a computer program, the processor 1004 being arranged to execute the steps of any of the method embodiments described above by means of the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
in the conversation process of a first object and a second object, acquiring an original sentence, to be translated, of a first language, currently input by the first object;
acquiring a history statement associated with an original statement, wherein the history statement comprises a statement generated by a first object and a statement generated by a second object in the history conversation process between the first object and the second object;
encoding an original statement and a historical statement to obtain a first target vector, a second target vector and a third target vector, wherein the first target vector is used for expressing role preference of a first object, the second target vector is used for expressing the sequence of the historical statement, and the third target vector is used for expressing content association between the historical statements;
splicing the first target vector, the second target vector and the third target vector into a combined vector;
and decoding the combined vector to obtain a translation result of the second language after the translation of the original sentence.
Alternatively, it can be understood by those skilled in the art that the structure shown in fig. 10 is only an illustration, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 10 is a diagram illustrating a structure of the electronic device. For example, the electronics may also include more or fewer components (e.g., network interfaces, etc.) than shown in FIG. 10, or have a different configuration than shown in FIG. 10.
The memory 1002 may be used to store software programs and modules, such as program instructions/modules corresponding to the translation method and apparatus in the embodiments of the present invention, and the processor 1004 executes various functional applications and data processing by running the software programs and modules stored in the memory 1002, that is, implementing the translation method described above. The memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1002 may further include memory located remotely from the processor 1004, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 1002 may be specifically, but not limited to, used for storing information such as original sentences, translation results, and historical sentences. As an example, as shown in fig. 10, the memory 1002 may include, but is not limited to, the first obtaining unit 902, the second obtaining unit 904, the encoding unit 906, the splicing unit 908, and the decoding unit 910 in the translating apparatus. In addition, other module units in the translation apparatus may also be included, but are not limited to, and are not described in detail in this example.
Optionally, the above-mentioned transmission device 1006 is used for receiving or sending data via a network. Examples of the network may include a wired network and a wireless network. In one example, the transmission device 1006 includes a Network adapter (NIC) that can be connected to a router via a Network cable and other Network devices so as to communicate with the internet or a local area Network. In one example, the transmission device 1006 is a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In addition, the electronic device further includes: a display 1008 for displaying the original sentence and the translation result; and a connection bus 1010 for connecting the respective module parts in the above-described electronic apparatus.
In other embodiments, the terminal device or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting a plurality of nodes through a network communication. Nodes can form a Peer-To-Peer (P2P, Peer To Peer) network, and any type of computing device, such as a server, a terminal, and other electronic devices, can become a node in the blockchain system by joining the Peer-To-Peer network.
According to a further aspect of an embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Alternatively, in the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the steps of:
in the conversation process of a first object and a second object, acquiring an original sentence, to be translated, of a first language, currently input by the first object;
acquiring a history statement associated with an original statement, wherein the history statement comprises a statement generated by a first object and a statement generated by a second object in the history conversation process between the first object and the second object;
encoding an original statement and a historical statement to obtain a first target vector, a second target vector and a third target vector, wherein the first target vector is used for expressing role preference of a first object, the second target vector is used for expressing the sequence of the historical statement, and the third target vector is used for expressing content association between the historical statements;
splicing the first target vector, the second target vector and the third target vector into a combined vector;
and decoding the combined vector to obtain a translation result of the second language after the translation of the original sentence.
Alternatively, in this embodiment, a person skilled in the art may understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be substantially or partially implemented in the prior art, or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium, and including instructions for causing one or more computer devices (which may be personal computers, servers, or network devices) to execute all or part of the steps of the method according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing descriptions are merely preferred embodiments of the present invention. It should be noted that those skilled in the art may make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also fall within the protection scope of the present invention.

Claims (15)

1. A method of translation, comprising:
in a conversation process between a first object and a second object, acquiring an original sentence in a first language, to be translated, that is currently input by the first object;
acquiring a history statement associated with the original sentence, wherein the history statement comprises a statement generated by the first object and a statement generated by the second object during the historical conversation between the first object and the second object;
encoding the original statement and the history statements to obtain a first target vector, a second target vector and a third target vector, wherein the first target vector indicates the role preference of the first object, the second target vector indicates the order of the history statements, and the third target vector indicates the content association between the history statements;
stitching the first target vector, the second target vector, and the third target vector into a combined vector;
and decoding the combined vector to obtain a translation result in the second language translated from the original sentence.
2. The method according to claim 1, wherein the history statements include a first history statement in the first language generated by the first object, a second history statement in the second language generated by the second object, a first translation statement of the first history statement, and a second translation statement of the second history statement, the first translation statement being a statement in the second language and the second translation statement being a statement in the first language, and wherein encoding the original statement and the history statements to obtain a first target vector, a second target vector, and a third target vector comprises:
acquiring a first statement, a second statement, a third statement and a fourth statement, wherein the first statement is the first history statement, the second statement is the first history statement and the second translation statement, the third statement is the history statement, and the fourth statement is the original statement;
obtaining a first vector, a second vector, a third vector and a fourth vector, wherein the first vector is a low-dimensional vector to which the first statement is mapped, the second vector is a low-dimensional vector to which the second statement is mapped, the third vector is a low-dimensional vector to which the third statement is mapped, and the fourth vector is a low-dimensional vector to which the fourth statement is mapped;
encoding the first vector as the first target vector, encoding the second vector and the fourth vector as the second target vector, and encoding the third vector as the third target vector.
3. The method of claim 2, wherein encoding the first vector as the first target vector, encoding the second vector and the fourth vector as the second target vector, and encoding the third vector as the third target vector comprises:
encoding the first vector as the first target vector and the third vector as the third target vector using a first encoding layer;
encoding the second vector and the fourth vector into a first intermediate vector and a second intermediate vector using the first encoding layer, encoding the second intermediate vector into a result vector using the second encoding layer, and combining the first intermediate vector and the result vector into the second target vector, wherein the first encoding layer comprises one encoding layer and the second encoding layer comprises five encoding layers.
4. The method of claim 1, wherein the stitching the first target vector, the second target vector, and the third target vector into a combined vector comprises:
and in the case that the original sentence comprises a coded vector, splicing the first target vector, the second target vector, the third target vector and the coded vector into the combined vector, wherein, when any word after the first word of the original sentence is translated, the coded vector is the word embedding vector of each word preceding that word in the original sentence.
5. The method of claim 1,
the encoding the original sentence and the historical sentence to obtain a first target vector, a second target vector and a third target vector comprises: inputting the historical statement into a target neural network model, and extracting the first target vector, the second target vector and the third target vector by a feature extraction layer of the target neural network model;
the decoding the combined vector to obtain a translation result in the second language translated from the original sentence comprises: determining, as the translation result, a result obtained by decoding the combined vector using a decoding layer of the target neural network model, wherein the target neural network model is a pre-trained model, and the feature extraction layer and the decoding layer comprise target parameters obtained by training original parameters.
6. The method of claim 5, wherein, before the historical statement is input into the target neural network model and the first target vector, the second target vector, and the third target vector are extracted by the feature extraction layer of the target neural network model, the method further comprises:
acquiring an original neural network model, wherein the original neural network model comprises the feature extraction layer and the decoding layer, and the feature extraction layer and the decoding layer comprise the original parameters;
obtaining a sample original sentence, a sample translation result, and a sample historical sentence of the sample original sentence, wherein the sample original sentence is a sentence in the first language generated by a third object, the sample translation result is a sentence in the second language translated from the sample original sentence, the sample historical sentence comprises a first sample historical sentence in the first language generated by the third object during a conversation between the third object and a fourth object, a second sample historical sentence in the second language generated by the fourth object, a first sample translation sentence of the first sample historical sentence, and a second sample translation sentence of the second sample historical sentence, the first sample translation sentence being a sentence in the second language and the second sample translation sentence being a sentence in the first language;
and training the original parameters of the original neural network model by using the sample original sentences, the sample translation results and the sample historical sentences to obtain the target neural network model.
7. The method of claim 6, wherein the training the raw parameters of the raw neural network model using the sample raw sentences, the sample translation results, and the sample historical sentences to obtain the target neural network model comprises:
performing the following operations on the original neural network model until the recognition accuracy of the original neural network model is greater than a first threshold value to obtain the target neural network model:
inputting the sample translation result into the feature extraction layer to obtain a first target sample vector, a second target sample vector and a third target sample vector;
inputting the sample original statement into the feature extraction layer to obtain a first original sample vector, a second original sample vector and a third original sample vector;
merging the first original sample vector, the second original sample vector, the third original sample vector and a sample coding vector of the sample original sentence into a sample combination vector, wherein, when any word after the first word of the sample original sentence is translated, the sample coding vector is the word embedding vector of each word preceding that word in the sample original sentence;
decoding the sample combination vector by using the decoding layer to obtain an estimated statement;
adjusting the value of the original parameter when the difference between the first target sample vector and the first original sample vector is greater than a first threshold, or the difference between the second target sample vector and the second original sample vector is greater than a second threshold, or the difference between the third target sample vector and the third original sample vector is greater than a third threshold, or the difference between the sample original statement and the estimated statement is greater than a fourth threshold.
8. A translation apparatus, comprising:
a first obtaining unit, configured to obtain, during a conversation process between a first object and a second object, an original sentence in a first language, to be translated, that is currently input by the first object;
a second obtaining unit, configured to obtain a history statement associated with the original statement, where the history statement includes a statement generated by the first object and a statement generated by the second object during the historical conversation between the first object and the second object;
the encoding unit is configured to encode the original sentence and the historical sentences to obtain a first target vector, a second target vector and a third target vector, where the first target vector is used to indicate role preference of the first object, the second target vector is used to indicate an order of the historical sentences, and the third target vector is used to indicate content association between the historical sentences;
a splicing unit configured to splice the first target vector, the second target vector, and the third target vector into a combined vector;
and a decoding unit, configured to decode the combined vector to obtain a translation result in the second language translated from the original sentence.
9. The apparatus according to claim 8, wherein the history statements include a first history statement in the first language generated by the first object, a second history statement in the second language generated by the second object, a first translation statement of the first history statement, and a second translation statement of the second history statement, the first translation statement being a statement in the second language and the second translation statement being a statement in the first language, and wherein the encoding unit comprises:
a first obtaining module, configured to obtain a first sentence, a second sentence, a third sentence, and a fourth sentence, where the first sentence is the first history sentence, the second sentence is the first history sentence and the second translation sentence, the third sentence is the history sentence, and the fourth sentence is the original sentence;
a second obtaining module, configured to obtain a first vector, a second vector, a third vector, and a fourth vector, where the first vector is a low-dimensional vector to which the first statement is mapped, the second vector is a low-dimensional vector to which the second statement is mapped, the third vector is a low-dimensional vector to which the third statement is mapped, and the fourth vector is a low-dimensional vector to which the fourth statement is mapped;
an encoding module to encode the first vector as the first target vector, the second vector and the fourth vector as the second target vector, and the third vector as the third target vector.
10. The apparatus of claim 9, wherein the encoding module comprises:
a first encoding module to encode the first vector as the first target vector and the third vector as the third target vector using a first encoding layer;
a second encoding module for encoding the second vector and the fourth vector into a first intermediate vector and a second intermediate vector using the first encoding layer, encoding the second intermediate vector into a result vector using the second encoding layer, and combining the first intermediate vector and the result vector into the second target vector, wherein the first encoding layer comprises one encoding layer and the second encoding layer comprises five encoding layers.
11. The apparatus of claim 8,
the encoding unit includes: a third encoding module, configured to input the historical statement into a target neural network model, and extract, by a feature extraction layer of the target neural network model, the first target vector, the second target vector, and the third target vector;
the decoding unit includes: and the decoding module is used for determining a result obtained by decoding the combined vector by using a decoding layer of the target neural network model as the translation result, wherein the target neural network model is a pre-trained model, and the feature extraction layer and the decoding layer comprise target parameters obtained after the original parameters are trained.
12. The apparatus of claim 11, further comprising:
a third obtaining unit, configured to obtain an original neural network model before the historical statement is input into the target neural network model and the first target vector, the second target vector, and the third target vector are extracted by a feature extraction layer of the target neural network model, where the original neural network model includes the feature extraction layer and the decoding layer, and the feature extraction layer and the decoding layer include the original parameter;
a fourth obtaining unit configured to obtain a sample original sentence, a sample translation result, and a sample history sentence of the sample original sentence, wherein the sample original sentence is a sentence in the first language generated by a third object, the sample translation result is a sentence in the second language translated from the sample original sentence, the sample history statements comprise a first sample history statement of the first language generated by the third object during the dialog process between the third object and a fourth object, a second sample history statement of the second language generated by the fourth object, a first sample translation statement of the first sample history statement, and a second sample translation statement of the second sample history statement, the first sample translation statement is a statement in the second language, and the second sample translation statement is a statement in the first language;
and the training unit is used for training the original parameters of the original neural network model by using the sample original sentences, the sample translation results and the sample historical sentences to obtain the target neural network model.
13. The apparatus of claim 12, wherein the training unit comprises:
a processing module, configured to perform the following operations on the original neural network model until the recognition accuracy of the original neural network model is greater than a first threshold, so as to obtain the target neural network model:
inputting the sample translation result into the feature extraction layer to obtain a first target sample vector, a second target sample vector and a third target sample vector;
inputting the sample original statement into the feature extraction layer to obtain a first original sample vector, a second original sample vector and a third original sample vector;
merging the first original sample vector, the second original sample vector, the third original sample vector and a sample coding vector of the sample original sentence into a sample combination vector, wherein, when any word after the first word of the sample original sentence is translated, the sample coding vector is the word embedding vector of each word preceding that word in the sample original sentence;
decoding the sample combination vector by using the decoding layer to obtain an estimated statement; and adjusting the value of the original parameter when the difference between the first target sample vector and the first original sample vector is greater than a first threshold, or the difference between the second target sample vector and the second original sample vector is greater than a second threshold, or the difference between the third target sample vector and the third original sample vector is greater than a third threshold, or the difference between the sample original statement and the estimated statement is greater than a fourth threshold.
14. A computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and the computer program, when executed, performs the method of any one of claims 1 to 7.
15. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 7 by means of the computer program.
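The stopping rule in the training procedure of claims 7 and 13 — adjust the original parameter only while some difference still exceeds its threshold — can be illustrated with a deliberately minimal, hypothetical sketch. A single scalar parameter and a gradient-like update stand in for the model's real parameters and optimizer; none of the names below come from the patent itself.

```python
def adjust_until_within(param, target, threshold, lr=0.5, max_steps=1000):
    # Toy stand-in for "adjust the value of the original parameter when the
    # difference ... is greater than a threshold": nudge param toward target
    # until the difference no longer exceeds the threshold.
    for _ in range(max_steps):
        diff = target - param
        if abs(diff) <= threshold:
            break                 # stopping condition of the claimed loop
        param += lr * diff        # crude gradient-like update
    return param

trained = adjust_until_within(param=0.0, target=1.0, threshold=0.01)
assert abs(1.0 - trained) <= 0.01
```

In the claimed procedure there are four such difference checks (one per sample vector pair plus one between the sample original statement and the estimated statement), each with its own threshold; this sketch shows only the shared update-while-above-threshold structure.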
CN202110212812.2A 2021-02-25 2021-02-25 Translation method and device, storage medium and electronic equipment Pending CN113569585A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110212812.2A CN113569585A (en) 2021-02-25 2021-02-25 Translation method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110212812.2A CN113569585A (en) 2021-02-25 2021-02-25 Translation method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN113569585A true CN113569585A (en) 2021-10-29

Family

ID=78161215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110212812.2A Pending CN113569585A (en) 2021-02-25 2021-02-25 Translation method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113569585A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114139560A (en) * 2021-12-03 2022-03-04 山东诗语翻译有限公司 Translation system based on artificial intelligence


Similar Documents

Publication Publication Date Title
CN109658928B (en) Cloud multi-mode conversation method, device and system for home service robot
WO2018058994A1 (en) Dialogue method, apparatus and device based on deep learning
CN109977207A (en) Talk with generation method, dialogue generating means, electronic equipment and storage medium
CN108153913B (en) Training method of reply information generation model, reply information generation method and device
CN111966800B (en) Emotion dialogue generation method and device and emotion dialogue model training method and device
CN112364148B (en) Deep learning method-based generative chat robot
CN114330312A (en) Title text processing method, apparatus, storage medium, and program
CN112257471A (en) Model training method and device, computer equipment and storage medium
CN116246213B (en) Data processing method, device, equipment and medium
CN114400005A (en) Voice message generation method and device, computer equipment and storage medium
CN116975288A (en) Text processing method and text processing model training method
CN115424013A (en) Model training method, image processing apparatus, and medium
CN116050352A (en) Text encoding method and device, computer equipment and storage medium
CN109933773A (en) A kind of multiple semantic sentence analysis system and method
CN114398505A (en) Target word determining method, model training method and device and electronic equipment
CN113569585A (en) Translation method and device, storage medium and electronic equipment
CN114282055A (en) Video feature extraction method, device and equipment and computer storage medium
CN117271745A (en) Information processing method and device, computing equipment and storage medium
CN116958738A (en) Training method and device of picture recognition model, storage medium and electronic equipment
CN117034951A (en) Digital person with specific language style based on large language model
CN116956953A (en) Translation model training method, device, equipment, medium and program product
CN112149426B (en) Reading task processing method and related equipment
CN114330701A (en) Model training method, device, computer equipment, storage medium and program product
CN112749556A (en) Multi-language model training method and device, storage medium and electronic equipment
CN116913278B (en) Voice processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination