CN111931517B - Text translation method, device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN111931517B
CN111931517B (application CN202010873804.8A)
Authority
CN
China
Prior art keywords
text
word
translated
training
predicted
Prior art date
Legal status
Active
Application number
CN202010873804.8A
Other languages
Chinese (zh)
Other versions
CN111931517A (en)
Inventor
阿敏巴雅尔
黄申
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010873804.8A
Publication of CN111931517A
Application granted
Publication of CN111931517B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/42Data-driven translation
    • G06F40/44Statistical methods, e.g. probability models

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the application discloses a text translation method, a text translation device, electronic equipment and a storage medium, which can be applied to the fields of artificial intelligence, big data and the like. The method comprises the following steps: acquiring a text to be translated, wherein the text to be translated is a mixed text of a source language and a target language; generating word vectors of first words in the text to be translated according to the word vector space corresponding to the source language, and generating word vectors of second words in the text to be translated according to the word vector space corresponding to the target language, wherein the first words are words corresponding to the source language, and the second words are words corresponding to the target language; determining coding features corresponding to the text to be translated according to word vectors of words contained in the text to be translated; and generating target text corresponding to the target language of the text to be translated according to the coding characteristics. By adopting the embodiment of the application, the mixed text of the source language and the target language can be accurately translated into the text of the target language, and the applicability is high.

Description

Text translation method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a text translation method, apparatus, electronic device, and storage medium.
Background
With the continuous development of artificial intelligence (Artificial Intelligence, AI) and big data, text translation has become one of the most important technologies.
Current text translation methods typically translate text in one language into text in another language with the same semantics. When the text to be translated is a mixed-language text, that is, when it contains words or phrases in multiple languages, current text translation methods often produce a translated text whose semantics are expressed inaccurately because of those multilingual words or phrases, and in some cases part of the words in the mixed language cannot be translated at all, resulting in low applicability.
Therefore, how to improve the accuracy of translating mixed-language text is an urgent problem to be solved.
Disclosure of Invention
The embodiment of the application provides a text translation method, a device, electronic equipment and a storage medium, which can accurately translate a mixed text of a source language and a target language into a text of the target language and have high applicability.
In a first aspect, an embodiment of the present application provides a text translation method, including:
acquiring a text to be translated, wherein the text to be translated is a mixed text of a source language and a target language;
Generating word vectors of first words in the text to be translated according to the word vector space corresponding to the source language, and generating word vectors of second words in the text to be translated according to the word vector space corresponding to the target language, wherein the first words are words corresponding to the source language, and the second words are words corresponding to the target language;
determining coding features corresponding to the text to be translated according to word vectors of words contained in the text to be translated;
and generating target text corresponding to the target language of the text to be translated according to the coding characteristics.
In a second aspect, an embodiment of the present application provides a text translation apparatus, including:
the acquisition module is used for acquiring a text to be translated, wherein the text to be translated is a mixed text of a source language and a target language;
the generation module is used for generating word vectors of first words in the text to be translated according to the word vector space corresponding to the source language, generating word vectors of second words in the text to be translated according to the word vector space corresponding to the target language, wherein the first words are words corresponding to the source language, and the second words are words corresponding to the target language;
The determining module is used for determining the coding characteristics corresponding to the text to be translated according to the word vectors of the words contained in the text to be translated;
and the translation module is used for generating target text of the text to be translated, which corresponds to the target language, according to the coding characteristics.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the processor and the memory are connected to each other;
the memory is used for storing a computer program;
the processor is configured to perform the method provided in the first aspect when the computer program is invoked.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program for execution by a processor to implement the method provided in the first aspect above.
In a fifth aspect, embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer-readable storage medium and executes them, so that the electronic device performs the method provided in the first aspect.
In the embodiment of the application, when the text to be translated is a mixed text of a source language and a target language, a shared target-end word list is used for the target-language words contained in the text to be translated; that is, the word vectors of the words belonging to the target language in the text to be translated and the word vectors of the words in the target text correspond to the same vector space, so that the word vectors of target-language words processed by the encoding part and the decoding part are identical. This effectively avoids the UNK (unknown words) problem in the translated target text, improves the decoding end's ability to recognize target-language words in the text to be translated, and thus improves the translation accuracy of the text to be translated.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a text translation method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a scenario of text translation provided by an embodiment of the present application;
FIG. 3 is a flowchart illustrating a method for generating a target text according to an embodiment of the present application;
FIG. 4 is a schematic diagram of generating target text based on a pointer network according to an embodiment of the present application;
FIG. 5 is another schematic view of a text translation method according to an embodiment of the present application;
FIG. 6 is a flow chart of a model training method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a scenario for determining alignment information according to an embodiment of the present application;
FIG. 8 is a schematic view of a scenario for determining a third training text provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of determining a second training set according to an embodiment of the present application;
FIG. 10 is a schematic view of a scenario of model training provided by an embodiment of the present application;
FIG. 11 is a diagram showing a comparison of performance of text translation models according to an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a text translation device according to an embodiment of the present application;
FIG. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The text translation method provided by the embodiment of the application can be applied to various fields such as artificial intelligence and big data, for example human-computer interaction based on natural language processing (Natural Language Processing, NLP), cloud computing and artificial intelligence cloud services in cloud technology, and related data computing and processing in the big data field, and aims to translate the text to be translated into target-language text. The text to be translated is a source-language text or a mixed text composed of the source language and the target language, which can be specifically determined based on the actual application scenario without limitation.
Artificial intelligence is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Research in this field involves natural language, i.e., the language people use daily, so it is closely related to the study of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robot question answering, and the like.
Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software, and network in a wide area network or a local area network to realize computing, storage, processing, and sharing of data. The text translation method provided by the embodiment of the application can be implemented based on cloud computing in cloud technology.
Cloud computing refers to obtaining required resources through a network in an on-demand and easily extensible manner; it is a product of the development and fusion of traditional computer and network technologies such as grid computing (Grid Computing), distributed computing (Distributed Computing), parallel computing (Parallel Computing), utility computing (Utility Computing), network storage (Network Storage Technologies), virtualization (Virtualization), and load balancing (Load Balance).
Artificial intelligence cloud service is also commonly referred to as AIaaS (AI as a Service). An AIaaS platform splits several common artificial intelligence services and provides independent or packaged services, such as a text translation service, in the cloud.
Big data (Big data) refers to a data set that cannot be captured, managed, and processed by conventional software tools within a certain time range; it is a massive, high-growth-rate, and diversified information asset that requires new processing modes to provide stronger decision-making, insight discovery, and process optimization abilities. With the advent of the cloud age, big data has attracted more and more attention. Big data relies on technologies such as massively parallel processing databases, data mining, distributed file systems, distributed databases, and cloud computing, on the basis of which the text translation method provided by the embodiment can be effectively implemented.
Referring to fig. 1, fig. 1 is a flowchart illustrating a text translation method according to an embodiment of the present application. The method can be executed by any electronic device, such as a server or a user terminal, or completed through interaction between a user terminal and a server; optionally, the method can be executed by the server, and the user terminal can send the text to be translated to the server, so that the server translates it. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server or server cluster providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Networks), big data, and artificial intelligence platforms. The user terminal may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., and the user terminal and the server may be directly or indirectly connected through wired or wireless communication, but are not limited thereto.
As shown in fig. 1, the text translation method provided by the embodiment of the application includes the following steps:
Step S1, acquiring a text to be translated.
In some possible embodiments, the text to be translated is a mixed text of a source language and a target language, where the source language is the language of the text to be translated and the target language is the language into which it is to be translated. The text to be translated may be any text that needs to be translated into target-language text, including but not limited to text arising from intelligent question answering, intelligent translation, semantic analysis, and other processes in the artificial intelligence field, such as text converted from simultaneously interpreted speech, or text entered by a user into a translation web page or translation tool, which is not limited herein.
Optionally, the text to be translated may also be a text in the source language, that is, a text composed only of words corresponding to the source language; based on the text translation method provided in this embodiment, such a source-language text is likewise translated into a target text corresponding to the target language. That is, the text translation method provided by the embodiment of the application can translate both source-language text and mixed text of the source language and the target language into target text corresponding to the target language.
Optionally, when the text translation method provided by the embodiment of the present application is executed by the user terminal, the user terminal may obtain text input by the user as the text to be translated, convert the user's speech into text to obtain the text to be translated, or obtain the text to be translated from the network, big data, and so on; the specific obtaining manner may be determined based on the actual application scenario requirements and is not limited herein.
Optionally, when the text translation method provided by the embodiment of the present application is executed by the server, the server may obtain text sent by any user terminal as the text to be translated, generate the text to be translated based on a generation instruction sent by the user terminal using technologies such as cloud computing and big data, or obtain the text to be translated from the storage space indicated by an acquisition instruction sent by the user terminal; the specific acquisition mode may be determined based on the actual application scenario requirements and is not limited herein. The storage space includes, but is not limited to, a cloud server, a cloud storage space, and the like, which are not limited herein.
Step S2, generating word vectors of first words in the text to be translated according to the word vector space corresponding to the source language, and generating word vectors of second words in the text to be translated according to the word vector space corresponding to the target language.
In some possible embodiments, word segmentation processing needs to be performed on the text to be translated before generating word vectors of the words it contains, so as to split the complete text to be translated into multiple independent words. Specifically, word segmentation can be performed on the text to be translated by methods such as forward maximum matching, reverse maximum matching, minimum segmentation, bidirectional matching, and shortest path matching based on a preset dictionary. Alternatively, a statistical word segmentation method can be used, such as one based on a hidden Markov model or an N-gram model. Optionally, word segmentation can be performed based on the semantics of the text to be translated, or based on a word segmentation tool; the specific word segmentation mode can be configured based on the requirements of the actual application scenario and is not limited herein.
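To make the dictionary-based option concrete, here is a minimal sketch of forward maximum matching in Python; the dictionary and the sample sentence are hypothetical, and the other strategies listed above (reverse maximum matching, bidirectional matching, etc.) follow the same dictionary-lookup pattern:

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy dictionary segmentation: at each position, take the
    longest dictionary word (up to max_len characters); fall back to
    a single character when nothing matches."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words

# Hypothetical preset dictionary; real systems use a large lexicon.
dictionary = {"相信", "成功", "我", "你", "会"}
print(forward_max_match("我相信你会成功", dictionary))
# -> ['我', '相信', '你', '会', '成功']
```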
In general, when a text to be translated contains both words corresponding to the source language and words corresponding to the target language, the target-language words and the corresponding words in the target text of the text to be translated generally represent the same meaning. However, existing text translation methods or models often treat target-language words in the text to be translated as source-language words, resulting in inaccurate translation or the occurrence of UNK. On the other hand, words belonging to the target language in the text to be translated (for convenience of description, hereinafter simply referred to as target words) force existing text translation models to add target words to the word list of the encoding end, which increases model parameters, increases the complexity of the model, increases the training parameters in model training and the amount of data processed during translation, and thus reduces text translation efficiency.
Therefore, in view of the above problems, in the text translation method provided in the embodiment of the present application, when a text to be translated includes target words corresponding to the target language, the word vectors of the target words and the word vectors of the words in the target text correspond to the same vector space, so that the target words in the text to be translated can be accurately identified without adding model parameters to be trained, improving translation accuracy and efficiency and reducing the occurrence of UNK.
Therefore, after the text to be translated is subjected to word segmentation processing to obtain each word contained in the text to be translated, each first word in each word contained in the text to be translated can be determined, and the word vector of each first word is generated according to the word vector space corresponding to the source language. Wherein each first word in each word contained in the text to be translated is a word corresponding to the source language. Meanwhile, a second word in each word contained in the text to be translated can be determined, and a word vector of the second word is generated according to a word vector space corresponding to the target language. Wherein each second word in each word contained in the text to be translated is a word corresponding to the target language.
Specifically, a word vector corresponding to a word in the source language (a first word) and a word vector corresponding to a word in the target language (a second word) may be generated based on a word vector encoding method (e.g., one-hot encoding) corresponding to the word vector space of each language. Alternatively, these word vectors may be generated based on word2vec models corresponding to the word vector spaces of the respective languages, or through an embedding layer corresponding to the word vector space of each language; the specific word vector generation method is not limited herein.
It should be specifically noted that the above method for determining the word vector of each word included in the text to be translated is merely an example, and may be specifically determined based on the actual application scenario requirement, which is not limited herein.
That is, when generating the word vector of each word included in the text to be translated, if a word corresponding to the target language (i.e., a target word) is present in the text to be translated, the target word is processed using an encoding manner recognizable by the decoder to obtain its word vector. In other words, through shared word embedding, the encoder and the decoder share one target-end word embedding, so that the embedding of a given target-language word handled by the encoder and by the decoder is always the same. Shared word embedding also means that the word vector of a target word on the encoder side can be recognized by the decoder, so all word vectors in the text to be translated can be correctly recognized by the decoder instead of being treated as UNK, which greatly reduces the possibility of UNK occurring and facilitates the generation of high-quality target text.
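The shared target-end embedding described above can be sketched as follows, assuming a PyTorch-style setup in which each token carries a mask marking whether it belongs to the target language; the vocabulary sizes, the embedding dimension, and the single shared id space are simplifying assumptions rather than details from the patent:

```python
import torch
import torch.nn as nn

class SharedTargetEmbedding(nn.Module):
    """Source words use their own embedding table; target-language
    words (whether they appear in the mixed input or are generated by
    the decoder) share one target-end table."""
    def __init__(self, src_vocab=32000, tgt_vocab=32000, dim=512):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)  # also used by the decoder

    def forward(self, token_ids, is_target_lang):
        # is_target_lang: bool mask, True where a token is a
        # target-language word and must use the shared target table.
        src_vecs = self.src_embed(token_ids)
        tgt_vecs = self.tgt_embed(token_ids)
        return torch.where(is_target_lang.unsqueeze(-1), tgt_vecs, src_vecs)

embed = SharedTargetEmbedding()
ids = torch.tensor([[12, 7, 450]])           # hypothetical token ids
mask = torch.tensor([[False, True, False]])  # the middle word is target-language
word_vectors = embed(ids, mask)              # (1, 3, 512)
```

In a real system each token would first be mapped into the id space of its own vocabulary before lookup; the mask-and-where form above only illustrates how one target table serves both the mixed input and the decoder.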
Step S3, determining coding features corresponding to the text to be translated according to word vectors of words contained in the text to be translated.
In some possible embodiments, the word vector of each word included in the text to be translated may be input to an encoder, that is, the encoder is used to encode the word vector of each word included in the text to be translated, so as to obtain the encoding feature corresponding to the text to be translated.
The encoder may be implemented by a neural network structure; the specific network structure of the encoder is not limited and may be selected and configured according to actual requirements. Candidate networks include, but are not limited to, recurrent neural networks (Recurrent Neural Networks, RNN), Long Short-Term Memory (LSTM) networks, gated recurrent units (Gated Recurrent Unit, GRU), self-attention based neural network structures, and the like, which may be configured and selected based on the actual application scenario requirements and are not limited herein.
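As one way to realize this step, a minimal sketch with the LSTM option named above (the layer count and dimension are assumptions):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """n-layer LSTM over the word vectors of the text to be translated;
    the last layer's per-position hidden states {h_1,n, ..., h_m,n}
    serve as the coding features."""
    def __init__(self, dim=512, layers=2):
        super().__init__()
        self.rnn = nn.LSTM(dim, dim, num_layers=layers, batch_first=True)

    def forward(self, word_vectors):             # (batch, m, dim)
        hidden_states, _ = self.rnn(word_vectors)
        return hidden_states                     # one h_i,n per source word

encoder = Encoder()
coding_features = encoder(torch.randn(1, 5, 512))  # m = 5 words
```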
Step S4, generating target text corresponding to the target language of the text to be translated according to the coding features.
In some possible implementations, after obtaining the encoding features of the text to be translated, the encoding features may be decoded based on a decoder, thereby generating target text of the text to be translated that corresponds to the target language. Each word in the target text is generated based on a word vector of a word before the word and coding features corresponding to the text to be translated.
It will be appreciated that for the first word in the target text, the word may be generated based on the coding features alone or based on the coding features and the start symbol vector.
Specifically, when the coding features are decoded for the first time by the decoder, the first word in the target text may be decoded based on the coding features and the start symbol vector. Further, the word vector of the first word is input to the decoder, so that the decoder obtains the second word in the target text based on the word vector of the first word and the coding features. The above process is repeated until decoding ends, yielding the word vectors of a plurality of words, and the target text of the text to be translated corresponding to the target language is then generated based on the word vectors of the plurality of words obtained by the decoder.
The decoder may likewise be implemented by a neural network structure; the specific network structure of the decoder is not limited and may be selected and configured according to actual requirements. Candidate networks include, but are not limited to, recurrent neural networks (RNN), Long Short-Term Memory (LSTM) networks, gated recurrent units (GRU), self-attention based neural network structures, and the like, which may be configured and selected based on the actual application scenario requirements and are not limited herein.
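The decoding loop just described can be sketched as follows; the `decoder(prev_vec, coding_features, state)` interface and the id-based vocabulary are assumptions for illustration, not the patent's concrete decoder:

```python
import torch

def greedy_decode(decoder, tgt_embed, coding_features,
                  start_id, eos_id, max_len=50):
    """Start from the start symbol, feed each predicted word's vector
    back into the decoder, and stop at <eos> or at the decoder's
    maximum decoding length."""
    output_ids, prev_id, state = [], start_id, None
    for _ in range(max_len):
        prev_vec = tgt_embed(torch.tensor([[prev_id]]))  # shared target table
        logits, state = decoder(prev_vec, coding_features, state)
        prev_id = int(logits.argmax(dim=-1))             # most probable word
        if prev_id == eos_id:                            # end symbol reached
            break
        output_ids.append(prev_id)
    return output_ids                                    # target word ids
```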
It will be appreciated that the word embedding layer at the encoding end and the word embedding layer at the decoding end may each be a part of the encoder/decoder or independent of the encoder/decoder; in the embodiment of the present application, the word embedding layer is described as a part independent of the encoder/decoder for convenience of description.
According to the above implementation, when the coding features are decoded by the decoder, the decoder can identify the word vectors of words belonging to the target language in the text to be translated. This avoids the misidentification or failure to decode that would result from encoding words of different languages in the same encoding manner, so that the semantics of the target-language text generated by the decoder are consistent with the semantics of the text to be translated.
The text translation method provided by the embodiment of the application is described below with reference to fig. 2. FIG. 2 is a schematic diagram of a scenario of text translation according to an embodiment of the present application. As shown in fig. 2, the text to be translated, "I believe you will succeed", is a mixed sentence of Chinese and English that needs to be translated into English (English being the target language); after word segmentation processing is performed on the text to be translated, its words, "I", "believe", "you", "will", and "succeed", are obtained. After the word vector of each word is determined, the word vectors are encoded by the encoder to obtain the coding features corresponding to the text to be translated. When the coding features are decoded by the decoder, a vector corresponding to the start symbol <start> is input to the decoder, so that the decoder obtains the first word "I" in the target text based on the vector corresponding to <start> and the coding features.
Further, a word vector corresponding to "I" is generated by the word embedding layer on the decoder side and input into the decoder, so that the decoder obtains the next word "believe" based on the word vector corresponding to "I" and the coding features, and so on, until the decoder outputs the end symbol <eos> or the decoding length reaches the maximum decoding length of the decoder, at which point decoding stops. The target text "I believe you will succeed" corresponding to the target language of the text to be translated may then be obtained based on the words produced by the decoder.
In some possible embodiments, to further improve the accuracy of text translation of the text to be translated, a target text of the text to be translated corresponding to the target language may be generated based on the pointer network. Referring specifically to fig. 3, fig. 3 is a flowchart illustrating a method for generating a target text according to an embodiment of the present application. As shown in fig. 3, the method for generating the target text provided by the embodiment of the application may include the following steps:
Step S41, for each word to be predicted of the target text corresponding to the target language of the text to be translated, determining the decoding feature corresponding to the word to be predicted according to the coding features and the word vector of the predicted word immediately preceding the word to be predicted.
In some possible embodiments, for convenience of description, in the target text corresponding to the target language of the text to be translated, a word that has been translated is referred to as a predicted word, and a word that has not yet been translated is referred to as a word to be predicted. The decoding feature of any word to be predicted in the target text is determined from the hidden state features produced when the decoder decodes the coding features based on the word vector of the predicted word immediately preceding that word to be predicted. For example, if the word vector of the last predicted word before a word to be predicted is input into an LSTM-based decoder, the decoder outputs the corresponding hidden state features, from which the decoding feature of the word to be predicted is determined. The hidden state features corresponding to the word to be predicted may be specifically determined based on the actual decoder structure, which is not limited herein.
Step S42, determining hidden state features corresponding to the text to be translated according to word vectors of words contained in the text to be translated.
In some possible embodiments, the hidden state features corresponding to the text to be translated are the hidden state features corresponding to the word vector of each word after the word vectors of the words included in the text to be translated are input into the encoder, i.e., the hidden state vectors {h_{1,n}, h_{2,n}, ..., h_{m,n}}, where m denotes the sentence length of the text to be translated, i.e., the number of words in the text to be translated, and n denotes the number of layers of the neural network used by the encoder.
Step S43, determining the attention distribution corresponding to the word to be predicted, the word distribution corresponding to the word to be predicted, the first weight corresponding to the attention distribution, and the second weight corresponding to the word distribution according to the hidden state features and the decoding feature.
In some possible embodiments, to further improve the accuracy of text translation, when determining any word to be predicted in the target text, the final word may be determined based on the attention distribution corresponding to the word to be predicted and its corresponding word distribution. The attention distribution corresponding to a word to be predicted represents the degree of attention (i.e., importance) each word in the text to be translated receives in the process of predicting that word, and the word distribution corresponding to a word to be predicted is the word probability distribution, output by the decoder based on the word vector of the last predicted word and the coding features, representing the probability that each word in the decoding dictionary is the word to be predicted.
Specifically, for any word to be predicted, attention calculation is performed based on the decoding feature s_t corresponding to the word to be predicted and the hidden state features {h_{1,n}, h_{2,n}, ..., h_{m,n}} corresponding to the text to be translated, yielding the attention distribution a_t = (a_{t,1}, ..., a_{t,m}) corresponding to the word to be predicted, where t denotes the current decoding time. Obviously, after one word to be predicted is determined, its word vector must be input into the decoder to obtain the decoding feature corresponding to the next word to be predicted; that is, the word just predicted becomes the last predicted word for the next word to be predicted in the target text. The attention distribution corresponding to the next word to be predicted is then obtained based on the decoding feature of that word and the hidden state features corresponding to the text to be translated.
Specifically, since the attention distribution a_t represents the degree of attention each word in the text to be translated receives during decoding, the context vector c_t corresponding to the word to be predicted can be determined based on the attention distribution a_t and the hidden state features {h_{1,n}, h_{2,n}, ..., h_{m,n}}, i.e., c_t = sum_{i=1}^{m} a_{t,i} * h_{i,n}. Further, based on the context vector c_t and the decoding feature s_t, the probability that each word in the decoding dictionary is the word to be predicted, i.e., the word distribution P_predict corresponding to the word to be predicted, is determined. Obviously, after each word to be predicted is determined, it is taken as a predicted word and its word vector is input into the decoder to obtain the decoding feature corresponding to the next word to be predicted, so that the word distribution of the next word to be predicted is obtained based on that decoding feature and the context vector of the next word.
In some possible embodiments, to further improve the accuracy of the word to be predicted, the degrees of influence of the attention distribution and of the word distribution on the word to be predicted may be determined, so that the word to be predicted is determined by combining these different degrees of influence. For convenience of description, the degree of influence of the attention distribution on the word to be predicted is referred to as the first weight, and the degree of influence of the word distribution on the word to be predicted is referred to as the second weight. The two weights can be determined from the hidden state features {h_{1,n}, h_{2,n}, ..., h_{m,n}} corresponding to the text to be translated (through the context vector c_t) and the decoding feature s_t of the word to be predicted: the second weight is g_pred = σ(c_t W_p + s_t W_q + b_r), and the first weight is 1 - g_pred, where σ is the sigmoid function and W_p, W_q, and b_r are neural network parameters that may be obtained through training and are not limited herein.
Step S44, generating the word to be predicted according to the attention distribution, the word distribution, the first weight, and the second weight.
In some possible embodiments, for any word to be predicted, the weighted combination of the attention distribution and the word distribution corresponding to the word to be predicted may be determined, and the word with the highest probability in the combined distribution is taken as the word to be predicted at the current time.
In some possible implementations, based on the attention distribution a_t of the current word to be predicted and its first weight (1 - g_pred), and the word distribution P_predict of the current word to be predicted and its second weight g_pred, the final word probability distribution of the current word to be predicted can be obtained as P = (1 - g_pred) * P_encdec-att + g_pred * P_predict. Here, P_encdec-att is the influence distribution corresponding to the attention distribution a_t and has the same dimension as P_predict, and (1 - g_pred) is the weight corresponding to P_encdec-att. P_encdec-att indicates the degree of influence of each word in the text to be translated on the word to be predicted. For example, when the text to be translated contains a word belonging to the target language (i.e., a target word), the distribution value of the target word in P_encdec-att is larger than those of the source-language words; that is, the target word has a higher degree of influence on the word to be predicted, and the probability that the target word appears in the target text as the word to be predicted is larger. In other words, words of the source language in the text to be translated do not appear in the target text, so their distribution values in P_encdec-att (which may be 0) are much smaller than those of target words of the target language. The influence distribution P_encdec-att corresponding to the attention distribution a_t may be obtained by normalizing the attention distribution and mapping the result to the same word dimension as P_predict, or implemented based on other attention distribution processing mechanisms and related processing functions, which is not limited herein. Based on the above implementation, the word to be predicted can be determined at each step and then used as a predicted word in the target text, thereby generating the target text of the text to be translated corresponding to the target language.
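A sketch of this final-distribution computation, assuming dot-product attention, simple linear projections, and a 0/1 scatter matrix mapping source positions onto decoding-dictionary ids; all names and shapes are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def pointer_mixture(s_t, hidden_states, src_to_vocab, W_p, W_q, b_r, W_out):
    """Final distribution P = (1 - g_pred) * P_encdec-att + g_pred * P_predict.
    s_t: decoding feature (1, d); hidden_states: {h_1,n ... h_m,n} (1, m, d);
    src_to_vocab: (m, V) 0/1 matrix scattering each source position onto
    its word's id in the decoding dictionary."""
    scores = (hidden_states @ s_t.unsqueeze(-1)).squeeze(-1)  # (1, m)
    att = F.softmax(scores, dim=-1)                 # attention distribution a_t
    c_t = (att.unsqueeze(-1) * hidden_states).sum(dim=1)      # context vector
    p_att_vocab = att @ src_to_vocab                # P_encdec-att over the vocab
    p_predict = F.softmax(torch.cat([c_t, s_t], dim=-1) @ W_out, dim=-1)
    g = torch.sigmoid(c_t @ W_p + s_t @ W_q + b_r)  # g_pred
    return (1 - g) * p_att_vocab + g * p_predict

d, m, V = 512, 5, 32000
P = pointer_mixture(
    s_t=torch.randn(1, d),
    hidden_states=torch.randn(1, m, d),
    src_to_vocab=torch.zeros(m, V).scatter_(1, torch.randint(V, (m, 1)), 1.0),
    W_p=torch.randn(d, 1), W_q=torch.randn(d, 1), b_r=torch.zeros(1),
    W_out=torch.randn(2 * d, V),
)
next_word_id = int(P.argmax(dim=-1))  # word with the highest probability
```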
The method shown in fig. 3 is further described with reference to fig. 4; fig. 4 is a schematic diagram of generating a target text based on a pointer network according to an embodiment of the present application. In fig. 4, the text to be translated is "I believe you will succeed", where the source language is English and the target language is Chinese. After the word vectors of the words in the text to be translated are input into the encoder, the hidden state features obtained for the text to be translated are {h_{1,n}, h_{2,n}, ..., h_{m,n}}. Given that the decoder has already predicted part of the target text corresponding to the target language, i.e., the target-language equivalents of "I", "believe", "you", and "will" in fig. 4, the word to be predicted after "will" in the target text needs to be predicted.
The word vector corresponding to the word "will" is input into the decoder, and based on the decoding feature s_t of the decoder and the hidden state features {h_{1,n}, h_{2,n}, ..., h_{m,n}} of the text to be translated, the attention distribution a_t corresponding to the word to be predicted is obtained, from which the corresponding influence distribution P_encdec-att is determined. Based on the attention distribution a_t and the hidden state features {h_{1,n}, h_{2,n}, ..., h_{m,n}}, the context vector c_t corresponding to the word to be predicted can be determined, and the word distribution P_predict corresponding to the word to be predicted is determined based on the context vector c_t and the decoding feature s_t. Further, after determining, based on c_t and s_t, the weight g_pred corresponding to the word distribution P_predict and the weight 1 - g_pred corresponding to the attention distribution, the final word probability distribution P corresponding to the word to be predicted is determined according to P_predict with its weight g_pred and the influence distribution P_encdec-att with its weight 1 - g_pred. At this time, the word with the highest probability in the word probability distribution P is the word to be predicted after "will" in the target text corresponding to the target language, i.e., the target-language equivalent of "succeed" shown in fig. 4. The target text of the text to be translated corresponding to the target language is thus determined based on the previously predicted words ("I", "believe", "you", and "will") and the above word to be predicted, "succeed".
It should be noted that if the text to be translated includes a target word belonging to the target language, the word vector of the target word and the word vectors of the words in the target text correspond to the same vector space. As shown in fig. 5, fig. 5 is another schematic view of a text translation method according to an embodiment of the present application. In fig. 5, the text to be translated is a mixed text of the source language (Uygur) and the target language (Chinese). When determining the word vectors of the words in the text to be translated, the word "university" differs from the other words: if it were processed with the vector generation manner corresponding to the source language, the decoder would likely treat the word vector of "university" as that of a Uygur word, resulting in a poor translation effect. Therefore, after "university" is processed with the encoding manner recognizable by the decoder to obtain its word vector, this word vector and the word vectors of the other Uygur words are input into the encoder, and after the related processing, the influence distribution and word distribution of "university" as a word to be predicted in the target text are obtained. That is, "university" in the text to be translated has a larger influence on the word to be predicted, so its final word probability distribution can be obtained based on the corresponding influence distribution and word distribution, in which the probability of "university" is the largest; the word with the largest probability ("university") is then taken as a word to be predicted in the target text.
In the embodiment of the application, when the text to be translated is a mixed text of a source language and a target language, a shared target-end word list is used for the target-language words contained in the text to be translated; that is, the word vectors of the target words belonging to the target language in the text to be translated and the word vectors of the words in the target text correspond to the same vector space, so that the word vectors of the target words processed by the encoding part and the decoding part are identical. This effectively avoids the UNK problem in the translated target text, improves the decoding end's ability to recognize target words in the text to be translated, and thus improves the translation accuracy of the text to be translated.
In some possible embodiments, the text translation method may be implemented based on a text translation model, where the training method of the text translation model may be referred to in fig. 6. FIG. 6 is a flow chart of a model training method according to an embodiment of the present application, where the model training method shown in FIG. 6 may include the following steps:
Step S5, acquiring a first training set and a second training set.
In some possible implementations, the training data for training the text translation model includes a first training set and a second training set. The first training set comprises a plurality of training text pairs, each comprising a first training text and a second training text, where the first training text is a source-language text and the second training text is the target-language text corresponding to the first training text. Assuming that the source language is Uygur and the target language is Chinese, the language of the first training text of each training text pair is Uygur, the language of the second training text is Chinese, and the semantics of the first training text and the second training text are the same. That is, for any pair of training texts in the first training set, the second training text is the target-language translation of the first training text.
The training text pairs in the first training set may be obtained from a network, a database, or the like, or constructed manually, which is not limited in the embodiment of the present application. The first training set can also be constructed with an existing translation model, i.e., by determining the corresponding text in the other language after obtaining a source-language text or a target-language text. It should be noted that training the initial model on the first training set gives the trained model the ability to translate a single-language text to be translated into target-language text.
Wherein the second training set also comprises a plurality of training text pairs, each training text pair comprising a third training text and a fourth training text. It should be specifically noted that, for each training text pair in the second training set, the third training text is a mixed text of the source language and the target language, and the fourth training text is a target language text corresponding to the third training text.
Assuming that the source language is Uygur and the target language is Chinese, then for each training text pair in the first training set, the language of the first training text is Uygur and the language of the corresponding second training text is Chinese. For each training text pair in the second training set, the third training text contains both Uygur and Chinese, the language of the fourth training text is Chinese, and the semantics of the third training text and the fourth training text are the same. That is, for any pair of training texts in the second training set, the third training text includes words in both the source language and the target language, and the fourth training text is the target-language translation of the third training text. It should be noted that training the initial translation model on the second training set gives the trained model the ability to translate mixed text of the source language and the target language into target-language text.
In some possible embodiments, when acquiring the second training set, an initial training set may be acquired first. The initial training set comprises a plurality of initial text pairs, each comprising a first text and a second text, where the first text is a source-language text and the second text is the target-language text corresponding to the first text. The initial training set may be obtained in the same manner as the first training set, or the first training set itself may be used as the initial training set; alternatively, part of the initial text pairs may be obtained in the same manner as the first training set while the training text pairs in the first training set serve as the remaining part, which may be specifically determined based on the actual application scenario requirements and is not limited herein.
For any initial text pair in the initial training set, words and/or phrases in the first text and the second text of the initial text pair are aligned based on a word alignment tool, a phrase table, or another translation model, so as to obtain the word alignment information and/or phrase alignment information of the first text and the second text. Word alignment tools include, but are not limited to, Fast Align, GIZA++, and the like, without limitation. A phrase table records the correspondence between words in the source language and the target language, such as the correspondence between English words and Chinese words in an English-Chinese dictionary.
As shown in fig. 7, fig. 7 is a schematic view of a scenario for determining alignment information according to an embodiment of the present application. In fig. 7, one initial text pair consists of a first text meaning "I believe you will succeed" and the second text "I believe you will succeed"; after aligning the words of the two texts in the above manner, their word alignment information is obtained, that is, the source-language words for "I", "believe", "you", "will", and "succeed" in the first text shown in fig. 7 correspond to "I", "believe", "you", "will", and "success" in the second text, respectively. For another initial text pair whose first text means "He encountered trouble" and whose second text is "He is in trouble", aligning the words of the two texts in the above manner yields their word alignment information and phrase alignment information, that is, "He" and "encountered trouble" in the first text shown in fig. 7 correspond to "He" and "is in trouble" in the second text, respectively.
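For illustration, alignment tools in this family typically emit one line of "i-j" pairs (Pharaoh format) per sentence pair; a minimal parser, with a toy pair mirroring fig. 7, might look as follows:

```python
def parse_alignment(line):
    """Parse one line of Pharaoh-format word alignment ("0-0 1-1 ..."):
    each "i-j" pair links the i-th source word to the j-th target word
    (both 0-based)."""
    return [tuple(map(int, pair.split("-"))) for pair in line.split()]

src = ["He", "encountered trouble"]   # glossed source word / phrase
tgt = ["He", "is in trouble"]
for i, j in parse_alignment("0-0 1-1"):
    print(src[i], "<->", tgt[j])
```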
Further, according to word alignment information and/or phrase alignment information in the first text and the second text of the initial text pair, at least one word and/or phrase in the second text is adopted to replace at least one word and/or phrase to be replaced in the corresponding first text, a replaced text is obtained, at this time, the replaced text can be determined to be a third training text in one training text pair in the second training set, and the second text corresponding to the replaced text is taken as a fourth training text corresponding to the third training text.
Specifically, text analysis can be performed on the first text and the second text to obtain text analysis results, which are then combined with the word alignment information and/or phrase alignment information to determine the words to be replaced and/or phrases to be replaced in the first text. The text analysis results describe the sentence components, such as pronouns, nouns, and verbs, of each word and/or phrase in the first text and the second text, and may be determined by means of sequence labeling, a syntax tree, a word replacement tool, and the like, or by random determination, which is not limited herein. Based on the text analysis results, meaningless words and/or phrases in the first text, such as modal particles, can be avoided when determining the words to be replaced and/or phrases to be replaced.
Further, at least one word and/or phrase in the second text corresponding to the word and/or phrase to be replaced is used to replace that word and/or phrase in the first text. The purpose of this replacement is to construct a text including both source-language words and/or phrases and target-language words and/or phrases; the number of words and/or phrases actually replaced can therefore be determined based on the actual application scenario and is not limited herein. Based on the above implementation, a plurality of training text pairs comprising a third training text and a fourth training text can be constructed, thereby building the second training set for training the initial translation model.
As shown in fig. 8, fig. 8 is a schematic view of a scenario for determining a third training text according to an embodiment of the present application. In fig. 8, the first text in the initial text pair means "I believe you will succeed" and the corresponding second text is "I believe you will succeed". Based on the word alignment information and the text analysis results, the words to be replaced in the first text can be determined to be the source-language words for "believe" and "succeed". These are then replaced with the corresponding "believe" and "succeed" from the second text, yielding a replaced training text in which the source-language words for "believe" and "succeed" have been substituted by their target-language counterparts. This text may then be used as the third training text in one of the training text pairs of the second training set, and the corresponding second text "I believe you will succeed" as the fourth training text corresponding to the third training text in fig. 8.
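A sketch of this replacement-based construction, assuming word alignments are already available; choosing replacement positions at random here is a stand-in for the text-analysis-based selection described above, and the replacement ratio and the sample sentences are assumed for illustration:

```python
import random

def make_mixed_text(src_words, tgt_words, alignment, replace_ratio=0.3):
    """Replace some aligned source words with their target-language
    counterparts, yielding a mixed-language third training text; the
    unchanged target sentence serves as the fourth training text."""
    mixed = list(src_words)
    if not alignment:
        return mixed
    k = max(1, int(len(alignment) * replace_ratio))
    for i, j in random.sample(alignment, k):
        mixed[i] = tgt_words[j]        # substitute the aligned target word
    return mixed

src = "我 相信 你 会 成功".split()              # illustrative source sentence
tgt = "I believe you will succeed".split()
print(make_mixed_text(src, tgt, [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]))
# e.g. ['我', 'believe', '你', '会', 'succeed']
```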
In some possible embodiments, to ensure the training effect of the second training set on the initial translation model, after the initial training set is acquired, each initial text pair in the initial training set can be filtered and screened based on preset conditions, so as to obtain a second training set with complete and richer semantics.
Specifically, the text lengths of the first text and/or the second text of each text pair in the initial training set can be determined, and any text pair in which the length of the first text or the second text is smaller than a set length can be filtered out. Filtering text pairs by text length removes pairs whose first text or second text has overly simple semantics or an incomplete semantic expression.
Optionally, when the word alignment information and/or phrase alignment information of any text pair is alignment information of specific characters only, this indicates that the first text and/or the second text in the text pair carries no real semantics, and the text pair can be rejected to avoid meaningless training text pairs. The specific characters include, but are not limited to, punctuation marks, character strings and other symbols, which are not limited herein.
Optionally, when the word alignment information and/or phrase alignment information of any text pair includes one-to-many or many-to-one word alignment information or phrase alignment information, this indicates that a certain word and/or phrase in the first text corresponds to a plurality of words and/or phrases in the second text, or that a certain word and/or phrase in the second text corresponds to a plurality of words and/or phrases in the first text. Such text pairs may be rejected to avoid ambiguity in the translation process.
Optionally, after the word alignment information and/or phrase alignment information of one initial text pair is determined, the alignment information of each word and/or phrase in its first text can be compared with the alignment information of previous initial text pairs. If the alignment information of a word and/or phrase in the current initial text pair is inconsistent with that of all previous initial text pairs, this indicates that the word alignment information and/or phrase alignment information of the current initial text pair may be erroneous, and the current initial text pair can be removed. For example, if the word "A" of the first text in the current initial text pair corresponds to the word "B" in the second text, but in all other initial text pairs the word "A" corresponds to a different word "C", the current initial text pair can be directly eliminated. A sketch of these filtering conditions is given below.
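The preset filtering conditions above can be sketched as a single predicate. This is a minimal sketch under the same assumed data structures; the function name and the `min_len` threshold are illustrative, and only the source side of each alignment link is inspected for specific characters.

```python
import string

def keep_pair(src_tokens, tgt_tokens, alignment, min_len=3):
    """Return False for initial text pairs that should be filtered out."""
    # 1. Either side shorter than the set length: semantics likely incomplete.
    if len(src_tokens) < min_len or len(tgt_tokens) < min_len:
        return False

    # 2. All alignment links cover only specific characters (punctuation etc.);
    #    an empty alignment is also rejected here as carrying no semantics.
    def is_symbol(word):
        return all(ch in string.punctuation for ch in word)
    if all(is_symbol(src_tokens[s]) for s, _ in alignment):
        return False

    # 3. One-to-many / many-to-one links: ambiguous for word replacement.
    src_sides = [s for s, _ in alignment]
    tgt_sides = [t for _, t in alignment]
    if len(set(src_sides)) != len(src_sides) or len(set(tgt_sides)) != len(tgt_sides):
        return False

    return True
```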
It should be noted that, to balance the training effect of the first training set and that of the second training set on the initial translation model, so that the finally obtained text translation model translates mixed-language text into target-language text and translates source-language text into target-language text with comparable accuracy, the number of training text pairs in the second training set may be a certain proportion of the number of training text pairs in the first training set, such as ten percent or twenty percent. The specific proportion may be determined based on the actual application scenario and is not limited herein.
In some possible embodiments, when the second training set is acquired based on the initial training set, the word alignment information and/or phrase alignment information of the first text and the second text in each initial text pair may be determined first, word replacement may then be performed based on the text analysis results, and the replaced text may finally be filtered. This can be described with reference to fig. 9, which is a schematic diagram of determining a second training set according to an embodiment of the present application. As shown in fig. 9, for each initial text pair in the initial training set, the word alignment information and/or phrase alignment information of its first text and second text are determined first, while text analysis is performed on the first text and the second text. After the words to be replaced and/or phrases to be replaced in the first text are replaced based on the text analysis results and the alignment information, whether the replacement is reasonable can be determined based on the preset conditions described above. That is, when the text length of the replaced text is smaller than the predetermined length, or the alignment information used in the replacement is alignment information of specific characters, or it includes one-to-many or many-to-one word alignment information or phrase alignment information, the replacement is determined to be unreasonable (N at the "replacement reasonable" decision step in fig. 9). If the replacement process for any initial text pair is unreasonable, that initial text pair is discarded and the next text pair is processed.
When the text length of the replaced text is not smaller than the predetermined length, the alignment information used in the replacement is not alignment information of specific characters, and it does not include one-to-many or many-to-one word alignment information or phrase alignment information, the replacement is determined to be reasonable (Y at the "replacement reasonable" decision step in fig. 9). After a training text pair in the second training set is obtained through the above replacement process, it is necessary to determine whether the replacement should end, i.e., whether to stop producing further training text pairs. When the replacement is determined to have ended (Y at the "replacement ended" decision step in fig. 9), the replacement process is terminated and the second training set is obtained. Otherwise (N at the "replacement ended" decision step in fig. 9), the replacement process is repeated to obtain new training text pairs until the replacement ends.

Specifically, when the number of training text pairs obtained through the replacement process reaches the preset proportion of the number of training text pairs in the first training set, or no first text with words to be replaced and/or phrases to be replaced remains, the replacement process can be determined to have ended; otherwise, the replacement process continues. A sketch of this overall loop is given below.
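Putting the pieces together, the fig. 9 loop can be sketched as follows. The sketch reuses the hypothetical `build_mixed_text` and `keep_pair` helpers from the earlier sketches, assumes a simple record type for aligned pairs, and uses a 10% ratio purely as an example value.

```python
from dataclasses import dataclass

@dataclass
class AlignedPair:
    src: list        # source-language tokens (first text)
    tgt: list        # target-language tokens (second text)
    alignment: list  # (src_index, tgt_index) links
    pos: list        # POS tag per source token

def build_second_training_set(initial_pairs, first_set_size, ratio=0.1):
    """Build mixed-language pairs until their count reaches a preset
    fraction of the first training set, or the input runs out."""
    second_set = []
    target_size = int(first_set_size * ratio)
    for pair in initial_pairs:
        if len(second_set) >= target_size:
            break                                        # "replacement ended"
        if not keep_pair(pair.src, pair.tgt, pair.alignment):
            continue                                     # unreasonable: discard pair
        mixed = build_mixed_text(pair.src, pair.tgt, pair.alignment, pair.pos)
        second_set.append((mixed, pair.tgt))             # (third text, fourth text)
    return second_set
```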
S6, training the initial translation model according to the first training set and the second training set until the training loss of the model meets a preset training ending condition, and determining the model at the time training is stopped as the text translation model.
In some possible implementations, when the initial translation model is trained based on the first training set and the second training set, the inputs of the initial translation model are the first training texts in the first training set and the third training texts in the second training set, and the initial translation model outputs, for each first training text and each third training text, a predicted text in the target language. Each time a first training text or a third training text is input into the initial translation model, a training loss of the initial translation model can be determined; the training loss characterizes the difference between the predicted text output for the input training text and the corresponding target-language training text (the second training text or the fourth training text). Model parameters of the initial translation model are continuously adjusted according to each training loss so as to improve the translation stability and accuracy of the model, and when the training loss meets a preset training ending condition, the model at the time training is stopped is determined as the final text translation model. The training ending condition may be that the training loss reaches a convergence state, that the converged loss value is lower than a preset threshold, or that the differences between the training losses of a set number of consecutive training iterations are smaller than a set value (i.e., the training loss is small and stable over multiple consecutive iterations), which is not limited herein. A minimal training-loop sketch for the last condition is given below.
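For concreteness, the third end condition (stable losses over consecutive iterations) can be sketched as follows. This is a minimal PyTorch-style sketch; the assumption that the model returns its training loss when called on a batch, and all hyperparameter values, are illustrative.

```python
import torch

def train(model, loader, max_steps=100_000, window=5, eps=1e-4):
    """Train until the loss differences over `window` consecutive steps
    stay below `eps` (one of the end conditions named above)."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    recent = []
    for step, (src_batch, tgt_batch) in enumerate(loader):
        opt.zero_grad()
        loss = model(src_batch, tgt_batch)   # assumed: model returns its training loss
        loss.backward()
        opt.step()

        recent.append(loss.item())
        if len(recent) > window:
            recent.pop(0)
            if max(recent) - min(recent) < eps:   # losses small and stable: stop
                break
        if step >= max_steps:
            break
    return model
```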
The initial translation model may be implemented by combining a neural network structure with a pointer network. The neural network includes, but is not limited to, a recurrent neural network (Recurrent Neural Network, RNN), a long short-term memory (Long Short-Term Memory, LSTM) network, a gated recurrent unit (Gated Recurrent Unit, GRU), and a self-attention based neural network structure, which may be configured and selected based on actual application scenario requirements, without limitation.
The model training method provided by the embodiment of the application is further described with reference to fig. 10, which is a schematic view of a model training scenario provided by an embodiment of the present application. Taking one initial text pair in the initial training set as an example, the initial text pair includes a Chinese second text (the example sentence shown in fig. 10) and its corresponding first text in Uygur. Word alignment information and/or phrase alignment information of the first text and the second text may be determined by means of word alignment, a phrase table and the like, and the text analysis results of the first text and the second text may be determined by means of sequence labeling, a syntax tree and the like, so that the words to be replaced and/or phrases to be replaced in the first text can be determined (optionally, the number of words and/or phrases to be replaced in each first text may be limited, e.g., by setting a maximum number of replacements). The words to be replaced and/or phrases to be replaced in the first text are then replaced based on the alignment information and the text analysis results, or replaced in a random manner (Random in fig. 10), to obtain different replaced text pairs. The resulting text pairs may further be used as training text pairs in the second training set and input into the initial translation model (which may be a Transformer-based initial translation model, as shown in fig. 10) to train it and obtain the final text translation model.
In some possible embodiments, to further ensure that the text translation model obtained by training on the first training set and the second training set has stable translation capability, the obtained text translation model can be tested with a test set after training ends. If the test result meets a preset test condition, model training is stopped; otherwise, model training continues based on the first training set and the second training set until the final text translation model is obtained.
Specifically, the test set comprises a plurality of groups of test texts, each group of test texts comprises a plurality of test text pairs, and each test text pair comprises a first test text and a plurality of second test texts. The first test text is a source-language text or a mixed text of the source language and the target language, and the second test texts are target-language texts corresponding to the first test text. That is, a first test text may correspond to a plurality of target-language texts at the same time, i.e., multiple conventional ways of expressing the same content. Moreover, the test texts in the test set may be obtained in the same manner as the first training set and the second training set, which is not described herein again.
When the text translation model is tested based on these test texts, the first test text in each group of test texts can be input into the text translation model to obtain a predicted text in the target language for each first test text, and the test value corresponding to each group of test texts is then determined based on the second test texts and the corresponding predicted texts. For the test value of a group of test texts, the text similarity between each second test text of every test text pair and the corresponding predicted text can be determined, and the average text similarity over the test text pairs in the group is then determined as the test value of that group. A higher test value indicates higher similarity between the predicted texts and the second test texts, i.e., a better translation effect of the model.
The text similarity between a second test text and the corresponding predicted text may be calculated based on an edit distance method, the Jaccard coefficient, TF or TF-IDF based methods, and the like; alternatively, a BLEU (Bilingual Evaluation Understudy) value may be calculated from the second test text and the corresponding predicted text and used as the text similarity. The specific calculation manner of the text similarity may be determined based on actual application scene requirements and is not limited herein. A minimal sketch of computing a group's test value is given below.
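As an illustration, a group's test value can be sketched as follows, here using a simple surface similarity from the Python standard library in place of BLEU. The `model.translate` method, the shape of `test_group`, and the choice of taking, for each test text pair, the best match among its multiple reference texts are assumptions for illustration.

```python
from difflib import SequenceMatcher

def text_similarity(a, b):
    """Simple surface similarity; a BLEU score could be substituted here."""
    return SequenceMatcher(None, a, b).ratio()

def group_test_value(test_group, model):
    """Average, over the group's test pairs, of the best similarity between
    the predicted text and any of the pair's reference second test texts.

    test_group: iterable of (first_text, [second_text, ...]) pairs.
    """
    scores = []
    for first_text, second_texts in test_group:
        predicted = model.translate(first_text)        # assumed model API
        scores.append(max(text_similarity(predicted, ref) for ref in second_texts))
    return sum(scores) / len(scores)
```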
Further, after the test value of each group of test texts is determined, if the test values of a preset number of consecutive groups no longer improve, the translation performance of the model tends to be stable, and it can be determined that the text translation model trained on the first training set and the second training set meets the preset test condition. Conversely, if the test values of a preset number of consecutive groups fluctuate greatly, the translation capability of the text translation model is not yet stable, and the model can continue to be trained based on the first training set and the second training set until the final text translation model is obtained.
In some possible embodiments, in combination with the above training manner, the initial translation model can also be tested on the test set in real time while it is being trained on the first training set and the second training set, so that training stops once a certain condition is reached and the final text translation model is obtained. Specifically, during model training, the test value of each group of test texts is determined in real time based on the initial translation model; when the test values of a preset number of consecutive groups no longer improve, the translation capability of the initial translation model tends to be stable. At this point, the initial translation model corresponding to the maximum test value may be determined as the final text translation model. A sketch of this early-stopping selection is given below.
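The early-stopping selection described above can be sketched as follows. The `patience` value is an example, and `checkpoints[i]` is assumed to be the model state saved when the i-th group's test value was computed.

```python
def select_model(checkpoints, test_values, patience=3):
    """Stop once the test value has not improved for `patience` consecutive
    groups, and keep the checkpoint with the maximum test value."""
    best_idx, stale = 0, 0
    for i, v in enumerate(test_values):
        if v > test_values[best_idx]:
            best_idx, stale = i, 0        # new best: reset the stale counter
        else:
            stale += 1
            if stale >= patience:         # no improvement for `patience` groups
                break
    return checkpoints[best_idx]
```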
Referring to table 1, table 1 shows partial examples of testing a text translation model on test data. After a text translation model is trained with Vietnamese as the source language and Chinese as the target language, the first test text in one test text pair is a mixed text of Vietnamese and Chinese in which the Chinese word for "apple" appears within an otherwise Vietnamese sentence; inputting it into the text translation model yields the corresponding predicted text "I like eating apples". After a text translation model is trained with Uygur as the source language and Chinese as the target language, the first test text in another test text pair is a mixed text of Uygur and Chinese containing the Chinese word "table"; inputting it into the text translation model yields the corresponding predicted text "what is written at the top corner of the table".
Table 1 Test examples
Based on the model training method provided by the embodiment of the application, a text translation model with high translation accuracy can be obtained. In table 2, method 1 is a conventional training method for a model translating the source language into the target language; method 2 performs model training on training data processed with a labeling method (Tentrans-tag); method 3 performs model training on training data after word conversion; and method 4 is the model training method provided by the present application. As can be seen from table 2, the model trained with the method provided by the embodiment of the present application has the highest BLEU value, indicating that the text translation model trained with this method has the highest translation accuracy. The recall rate reflects the accuracy of the predicted samples output by the model; the text translation model trained with the method provided by the embodiment of the present application also achieves a high recall rate compared with the models trained by the other methods.
Table 2 Comparison of the performance of text translation models obtained by different training methods
Method BLEU Recall
Method 1 31.25 2.09%
Method 2 48.10 99.16%
Method 3 47.08 74.40%
Method 4 (the present application) 50.87 93.52%
As shown in table 3, the text translation model trained by the model training method provided by the embodiment of the application has a higher BLEU value and recall rate (RECALL) than existing text translation products (tools), indicating that the text translation accuracy of the text translation model in the embodiment of the application is higher than that of existing products.
Table 3 Comparison of the performance of the text translation model in embodiments of the application with existing products
Product BLEU Recall
Text translation model 50.87 93.52%
Product 1 45.52 63.01%
Product 2 31.59 18.60%
Product 3 20.75 20.38%
Further, referring to fig. 11, fig. 11 is a schematic diagram of the performance comparison of the text translation model according to an embodiment of the present application, based on tables 2 and 3. As can be seen from fig. 11, the training effect of the model training method provided by the embodiment of the application is clearly better than that of the other existing training methods, and the translation accuracy of the text translation model trained with it is clearly better than that of existing text translation products.
In the embodiment of the application, when the text to be translated is a mixed text of the source language and the target language, a shared target-side vocabulary is used for the target words of the target language contained in the text to be translated. That is, the word vectors of the target words belonging to the target language in the text to be translated and the word vectors of the words in the target text correspond to the same vector space, so that the word vectors of the target words processed by the encoding part and the decoding part are identical. This effectively avoids unknown words (UNK) appearing in the translated target text and improves the decoder's ability to recognize target words in the text to be translated. On the other hand, the pointer network promotes the output, by the decoder, of target words belonging to the target language in the text to be translated, thereby improving the translation accuracy of the text to be translated. A sketch of the pointer-network mixture used at the decoder is given below.
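For reference, the way a pointer network mixes the two distributions with the two weights can be sketched as follows. This follows the standard pointer-generator formulation, in which the second weight is a generation probability p_gen and the first weight is 1 - p_gen; treating the patent's weights this way is an assumption for illustration.

```python
import numpy as np

def final_word_distribution(p_vocab, attention, src_token_ids, p_gen):
    """Mix the decoder's vocabulary distribution with the attention
    distribution scattered onto the source tokens' vocabulary ids:
        P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum_i a_i * [src_i == w]
    """
    p_final = p_gen * p_vocab                       # generation path (word distribution)
    for a_i, tok_id in zip(attention, src_token_ids):
        p_final[tok_id] += (1.0 - p_gen) * a_i      # copy path (attention / pointer)
    return p_final

# Toy usage: vocabulary of 6 words, source sentence of 3 tokens.
p_vocab = np.full(6, 1 / 6)
attn = np.array([0.2, 0.5, 0.3])
print(final_word_distribution(p_vocab, attn, [4, 2, 2], p_gen=0.7))
```

Because the copy path assigns probability mass directly to source-token ids, a target-language word that appears in the mixed input can be emitted verbatim even when the generation path ranks it low, which is exactly the behavior the embodiment relies on.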
Referring to fig. 12, fig. 12 is a schematic structural diagram of a text translation device according to an embodiment of the present application. The text translation device 1 provided by the embodiment of the application comprises:
an obtaining module 11, configured to obtain a text to be translated, where the text to be translated is a mixed text of a source language and a target language;
a generating module 12, configured to generate a word vector of each first word in the text to be translated according to the word vector space corresponding to the source language, and generate a word vector of each second word in the text to be translated according to the word vector space corresponding to the target language, where the first word is a word corresponding to the source language, and the second word is a word corresponding to the target language;
the determining module 13 is configured to determine coding features corresponding to the text to be translated according to word vectors of words included in the text to be translated;
and the translation module 14 is used for generating target text corresponding to the target language of the text to be translated according to the coding characteristics.
In some possible embodiments, the generating module 12 is configured to:
word segmentation processing is carried out on the text to be translated, so that words contained in the text to be translated are obtained;
determining each first word in each word contained in the text to be translated, and generating a word vector of each first word according to a word vector space corresponding to the source language;
And determining each second word in each word contained in the text to be translated, and generating a word vector of each second word according to a word vector space corresponding to the target language.
In some possible embodiments, the translation module 14 is configured to:
for each word to be predicted of the target text in the target language corresponding to the text to be translated, determining decoding features corresponding to the word to be predicted according to the coding features and the word vectors of the predicted words, wherein the predicted words are the words predicted before the word to be predicted;
and generating the word to be predicted according to the decoding characteristics.
In some possible embodiments, the translation module 14 is configured to:
determining hidden state features corresponding to the text to be translated according to word vectors of words contained in the text to be translated; determining attention distribution corresponding to the word to be predicted, word distribution corresponding to the word to be predicted, first weight corresponding to the attention distribution and second weight corresponding to the word distribution according to the hidden state feature and the decoding feature;
and generating the word to be predicted according to the attention distribution, the word distribution, the first weight and the second weight.
In some possible embodiments, the translation module 14 is configured to:
determining the attention distribution corresponding to the word to be predicted according to the hidden state features and the decoding features;
determining a context vector corresponding to the word to be predicted according to the hidden state characteristics and the attention distribution;
and determining word distribution corresponding to the word to be predicted, first weight corresponding to the attention distribution and second weight corresponding to the word distribution according to the context vector and the decoding characteristics.
In some possible embodiments, generating a word vector of each word included in the text to be translated, determining a coding feature corresponding to the text to be translated according to the word vector of each word included in the text to be translated, and generating a target text corresponding to the target language of the text to be translated according to the coding feature is implemented through a text translation model;
the text translation model is obtained through training by a training device:
the training device is used for:
acquiring a first training set and a second training set;
each training text pair in the first training set comprises a first training text and a second training text, wherein the first training text is a source language text, and the second training text is a target language text corresponding to the first training text;
Each training text pair in the second training set comprises a third training text and a fourth training text, wherein the third training text is a mixed text of the source language and the target language, and the fourth training text is a target language text corresponding to the third training text;
training an initial translation model according to the first training set and the second training set until training loss of the model meets a preset training ending condition, and determining the model when training is stopped as the text translation model;
wherein the input of the initial translation model is the first training text and the third training text, and the output of the initial translation model is the predicted text of the target language corresponding to the input training texts;
the training loss characterizes a difference between each training text of the input and the corresponding predicted text.
In some possible embodiments, the training device is configured to:
acquiring the second training set includes:
acquiring an initial training set, wherein each initial text pair in the initial training set comprises a first text and a second text, the first text is a source language text, and the second text is a target language text corresponding to the first text;
For each initial text pair in the initial training set, determining word alignment information and/or phrase alignment information in a first text and a second text of the initial text pair;
and according to the word alignment information and/or phrase alignment information, replacing words and/or phrases in the corresponding first text with at least one word and/or phrase in the second text of the initial text pair to obtain the third training text, and taking the second text as the fourth training text.
In some possible embodiments, the training device is configured to:
performing text analysis on the first text and the second text to obtain text analysis results, wherein the text analysis comprises syntactic analysis and/or sequence labeling;
determining a word to be replaced and/or a phrase to be replaced in the first text according to the word alignment information and/or phrase alignment information and the text analysis result;
and replacing the word to be replaced and/or the phrase to be replaced in the first text by at least one word and/or phrase corresponding to the word to be replaced and/or the phrase to be replaced in the second text.
In some possible embodiments, the training device is further configured to:
Filtering each initial text pair in the initial training set according to preset conditions;
wherein the preset conditions include at least one of the following:
the text length of the first text and/or the text length of the second text of the text pair is smaller than the set length;
word alignment information and/or phrase alignment information of a text pair is alignment information of a specific character;
word alignment information and/or phrase alignment information for a text pair includes one-to-many or many-to-one word alignment information or phrase alignment information.
In some possible embodiments, the initial translation model is a pointer network based translation model.
In a specific implementation, the text translation device 1 may execute, through each functional module built in the text translation device, an implementation manner provided by each step in fig. 1, fig. 3, and/or fig. 6, and specifically, the implementation manner provided by each step may be referred to, which is not described herein again.
In the embodiment of the application, the word vector of the target word belonging to the target language in the text to be translated and the word vector of the word in the target text are corresponding to the same vector space, so that the recognition rate of the word vector of each word in the text to be translated can be improved, and the translation accuracy of the text to be translated can be improved. Meanwhile, based on the text translation model trained by the pointer network, when a target word belonging to a target language appears in the text to be translated, the target word can be accurately identified and output, so that the translation accuracy of the mixed language text is further improved.
Referring to fig. 13, fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 13, the electronic device 1000 in the present embodiment may include: a processor 1001, a network interface 1004 and a memory 1005; in addition, the electronic device 1000 may further include a user interface 1003 and at least one communication bus 1002. The communication bus 1002 is used to enable connected communication between these components. The user interface 1003 may include a display (Display) and a keyboard (Keyboard), and optionally may further include standard wired and wireless interfaces. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM, or a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 1005 may optionally also be at least one storage device located remotely from the processor 1001. As shown in fig. 13, the memory 1005, which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module and a device control application program.
In the electronic device 1000 shown in fig. 13, the network interface 1004 may provide a network communication function; while user interface 1003 is primarily used as an interface for providing input to a user; and the processor 1001 may be used to invoke a device control application stored in the memory 1005 to implement:
acquiring a text to be translated, wherein the text to be translated is a mixed text of a source language and a target language;
generating word vectors of first words in the text to be translated according to the word vector space corresponding to the source language, and generating word vectors of second words in the text to be translated according to the word vector space corresponding to the target language, wherein the first words are words corresponding to the source language, and the second words are words corresponding to the target language;
determining coding features corresponding to the text to be translated according to word vectors of words contained in the text to be translated;
and generating target text corresponding to the target language of the text to be translated according to the coding characteristics.
In some possible embodiments, the processor 1001 is configured to:
word segmentation processing is carried out on the text to be translated, so that words contained in the text to be translated are obtained;
Determining each first word in each word contained in the text to be translated, and generating a word vector of each first word according to a word vector space corresponding to the source language;
and determining each second word in each word contained in the text to be translated, and generating a word vector of each second word according to a word vector space corresponding to the target language.
In some possible embodiments, the processor 1001 is configured to:
for each word to be predicted of the target text in the target language corresponding to the text to be translated, determining decoding features corresponding to the word to be predicted according to the coding features and the word vectors of the predicted words, wherein the predicted words are the words predicted before the word to be predicted;
and generating the word to be predicted according to the decoding characteristics.
In some possible embodiments, the processor 1001 is configured to:
determining hidden state features corresponding to the text to be translated according to word vectors of words contained in the text to be translated;
determining attention distribution corresponding to the word to be predicted, word distribution corresponding to the word to be predicted, first weight corresponding to the attention distribution and second weight corresponding to the word distribution according to the hidden state feature and the decoding feature;
And generating the word to be predicted according to the attention distribution, the word distribution, the first weight and the second weight.
In some possible embodiments, the processor 1001 is configured to:
determining the attention distribution corresponding to the word to be predicted according to the hidden state features and the decoding features;
determining a context vector corresponding to the word to be predicted according to the hidden state characteristics and the attention distribution;
and determining word distribution corresponding to the word to be predicted, first weight corresponding to the attention distribution and second weight corresponding to the word distribution according to the context vector and the decoding characteristics.
In some possible embodiments, generating a word vector of each word included in the text to be translated, determining a coding feature corresponding to the text to be translated according to the word vector of each word included in the text to be translated, and generating a target text corresponding to the target language of the text to be translated according to the coding feature is implemented through a text translation model;
the processor 1001 is configured to:
acquiring a first training set and a second training set;
each training text pair in the first training set comprises a first training text and a second training text, wherein the first training text is a source language text, and the second training text is a target language text corresponding to the first training text;
Each training text pair in the second training set comprises a third training text and a fourth training text, wherein the third training text is a mixed text of the source language and the target language, and the fourth training text is a target language text corresponding to the third training text;
training an initial translation model according to the first training set and the second training set until training loss of the model meets a preset training ending condition, and determining the model when training is stopped as the text translation model;
wherein the input of the initial translation model is the first training text and the third training text, and the output of the initial translation model is the predicted text of the target language corresponding to the input training texts;
the training loss characterizes a difference between each training text of the input and the corresponding predicted text.
The processor 1001 is configured to:
acquiring an initial training set, wherein each initial text pair in the initial training set comprises a first text and a second text, the first text is a source language text, and the second text is a target language text corresponding to the first text;
For each initial text pair in the initial training set, determining word alignment information and/or phrase alignment information in a first text and a second text of the initial text pair;
and according to the word alignment information and/or phrase alignment information, replacing words and/or phrases in the corresponding first text with at least one word and/or phrase in the second text of the initial text pair to obtain the third training text, and taking the second text as the fourth training text.
In some possible embodiments, the processor 1001 is configured to:
performing text analysis on the first text and the second text to obtain text analysis results, wherein the text analysis comprises syntactic analysis and/or sequence labeling;
determining a word to be replaced and/or a phrase to be replaced in the first text according to the word alignment information and/or phrase alignment information and the text analysis result;
and replacing the word to be replaced and/or the phrase to be replaced in the first text by at least one word and/or phrase corresponding to the word to be replaced and/or the phrase to be replaced in the second text.
In some possible embodiments, the processor 1001 is further configured to:
Filtering each initial text pair in the initial training set according to preset conditions;
wherein the preset conditions include at least one of the following:
the text length of the first text and/or the text length of the second text of the text pair is smaller than the set length;
word alignment information and/or phrase alignment information of a text pair is alignment information of a specific character;
word alignment information and/or phrase alignment information for a text pair includes one-to-many or many-to-one word alignment information or phrase alignment information.
In some possible embodiments, the initial translation model is a pointer network based translation model.
It should be appreciated that in some possible embodiments, the processor 1001 may be a central processing unit (central processing unit, CPU), or another general-purpose processor, a digital signal processor (digital signal processor, DSP), an application-specific integrated circuit (application specific integrated circuit, ASIC), a field-programmable gate array (field-programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or any conventional processor. The memory may include read-only memory and random access memory, and provides instructions and data to the processor. A portion of the memory may also include non-volatile random access memory. For example, the memory may also store information on the device type.
In a specific implementation, the electronic device 1000 may execute, through each functional module built in the electronic device, an implementation manner provided by each step in fig. 1, fig. 3, and/or fig. 6, and the specific implementation manner provided by each step may be referred to, which is not described herein again.
In the embodiment of the application, the word vector of the target word belonging to the target language in the text to be translated and the word vector of the word in the target text are corresponding to the same vector space, so that the recognition rate of the word vector of each word in the text to be translated can be improved, and the translation accuracy of the text to be translated can be improved. Meanwhile, based on the text translation model trained by the pointer network, when a target word belonging to a target language appears in the text to be translated, the target word can be accurately identified and output, so that the translation accuracy of the mixed language text is further improved.
The embodiment of the present application further provides a computer readable storage medium, where a computer program is stored and executed by a processor to implement the method provided by each step in fig. 1, fig. 3, and/or fig. 6, and specifically refer to the implementation manner provided by each step, which is not described herein again.
The computer readable storage medium may be an internal storage unit of the device provided in any one of the foregoing embodiments, for example, a hard disk or a memory of the electronic device. The computer readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card provided on the electronic device. The computer readable storage medium may also include a magnetic disk, an optical disk, a read-only memory (read-only memory, ROM), a random access memory (random access memory, RAM), and the like. Further, the computer readable storage medium may include both an internal storage unit and an external storage device of the electronic device. The computer readable storage medium is used to store the computer program and other programs and data required by the electronic device, and may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from the computer-readable storage medium by a processor of the electronic device, and executed by the processor, cause the computer device to perform the methods provided by the steps of fig. 1, 3, and/or 6.
The terms first, second and the like in the claims and in the description and drawings are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or electronic device that comprises a list of steps or elements is not limited to the list of steps or elements but may, alternatively, include other steps or elements not listed or inherent to such process, method, article, or electronic device. Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments. The term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The foregoing disclosure is illustrative of the present application and is not to be construed as limiting the scope of the application, which is defined by the appended claims.

Claims (12)

1. A method of text translation, comprising:
acquiring a text to be translated, wherein the text to be translated is a mixed text of a source language and a target language;
generating word vectors of first words in the text to be translated according to the word vector space corresponding to the source language, and generating word vectors of second words in the text to be translated according to the word vector space corresponding to the target language, wherein the first words are words corresponding to the source language, and the second words are words corresponding to the target language;
Determining coding features and hidden state features corresponding to the text to be translated according to word vectors of words contained in the text to be translated;
for each word to be predicted of the target text in the target language corresponding to the text to be translated, determining decoding features corresponding to the word to be predicted according to the coding features and word vectors of predicted words, wherein the predicted words are words predicted before the word to be predicted;
determining attention distribution corresponding to the word to be predicted, word distribution corresponding to the word to be predicted, first weight corresponding to the attention distribution and second weight corresponding to the word distribution according to the hidden state features and the decoding features;
determining influence distribution for representing influence degree of each word in the text to be translated on the word to be predicted according to the attention distribution;
determining word probability distribution of the word to be predicted according to the influence distribution, the word distribution, the first weight and the second weight;
and generating the word to be predicted according to the word probability distribution.
2. The method of claim 1, wherein the generating a word vector of each first word in the text to be translated according to the word vector space corresponding to the source language, and generating a word vector of each second word in the text to be translated according to the word vector space corresponding to the target language, comprises:
Word segmentation processing is carried out on the text to be translated, so that words contained in the text to be translated are obtained;
determining each first word in each word contained in the text to be translated, and generating a word vector of each first word according to a word vector space corresponding to the source language;
and determining each second word in each word contained in the text to be translated, and generating a word vector of each second word according to a word vector space corresponding to the target language.
3. The method of claim 1, wherein determining the attention distribution corresponding to the word to be predicted, the word distribution corresponding to the word to be predicted, the first weight corresponding to the attention distribution, and the second weight corresponding to the word distribution based on the hidden state features and the decoding features comprises:
determining the attention distribution corresponding to the word to be predicted according to the hidden state features and the decoding features;
determining a context vector corresponding to the word to be predicted according to the hidden state features and the attention distribution;
and determining word distribution corresponding to the word to be predicted, first weight corresponding to the attention distribution and second weight corresponding to the word distribution according to the context vector and the decoding characteristic.
4. The method according to claim 1, wherein generating a word vector of each word included in the text to be translated, determining a coding feature corresponding to the text to be translated according to the word vector of each word included in the text to be translated, and generating a target text corresponding to the target language of the text to be translated according to the coding feature are implemented through a text translation model;
wherein the text translation model is trained by:
acquiring a first training set and a second training set;
each training text pair in the first training set comprises a first training text and a second training text, wherein the first training text is a source language text, and the second training text is a target language text corresponding to the first training text;
each training text pair in the second training set comprises a third training text and a fourth training text, wherein the third training text is a mixed text of the source language and the target language, and the fourth training text is a target language text corresponding to the third training text;
training an initial translation model according to the first training set and the second training set until training loss of the model meets a preset training ending condition, and determining the model when training is stopped as the text translation model;
The input of the initial translation model is the first training texts and the third training texts, and the output of the initial translation model is the predicted text of each input training text corresponding to the target language;
the training loss characterizes a difference between each training text of the input and the corresponding predicted text.
5. The method of claim 4, wherein obtaining the second training set comprises:
acquiring an initial training set, wherein each initial text pair in the initial training set comprises a first text and a second text, the first text is a source language text, and the second text is a target language text corresponding to the first text;
for each initial text pair in the initial training set, determining word alignment information and/or phrase alignment information in a first text and a second text of the initial text pair;
and according to the word alignment information and/or phrase alignment information, replacing words and/or phrases in the corresponding first text with at least one word and/or phrase in the second text of the initial text pair to obtain the third training text, and taking the second text as the fourth training text.
6. The method of claim 5, wherein replacing words and/or phrases in the corresponding first text with at least one word and/or phrase in the second text of the initial text pair according to the word alignment information and/or phrase alignment information, comprises:
performing text analysis on the first text and the second text to obtain text analysis results, wherein the text analysis comprises syntactic analysis and/or sequence labeling;
determining a word to be replaced and/or a phrase to be replaced in the first text according to the word alignment information and/or the phrase alignment information and the text analysis result;
and replacing the word to be replaced and/or the phrase to be replaced in the first text by adopting at least one word and/or phrase corresponding to the word to be replaced and/or the phrase to be replaced in the second text.
7. The method of claim 5, wherein the method further comprises:
filtering each initial text pair in the initial training set according to a preset condition;
wherein the preset conditions include at least one of:
the text length of the first text and/or the text length of the second text of the text pair is smaller than the set length;
Word alignment information and/or phrase alignment information of a text pair is alignment information of a specific character;
word alignment information and/or phrase alignment information for a text pair includes one-to-many or many-to-one word alignment information or phrase alignment information.
8. The method according to any of claims 4 to 7, wherein the initial translation model is a pointer network based translation model.
9. A text translation device, the text translation device comprising:
the system comprises an acquisition module, a translation module and a translation module, wherein the acquisition module is used for acquiring a text to be translated, and the text to be translated is a mixed text of a source language and a target language;
the generation module is used for generating word vectors of first words in the text to be translated according to the word vector space corresponding to the source language, generating word vectors of second words in the text to be translated according to the word vector space corresponding to the target language, wherein the first words are words corresponding to the source language, and the second words are words corresponding to the target language;
the determining module is used for determining coding features and hidden state features corresponding to the text to be translated according to word vectors of words contained in the text to be translated;
A translation module for:
for each word to be predicted of the target text in the target language corresponding to the text to be translated, determining decoding features corresponding to the word to be predicted according to the coding features and word vectors of predicted words, wherein the predicted words are words predicted before the word to be predicted;
determining attention distribution corresponding to the word to be predicted, word distribution corresponding to the word to be predicted, first weight corresponding to the attention distribution and second weight corresponding to the word distribution according to the hidden state features and the decoding features;
determining influence distribution for representing influence degree of each word in the text to be translated on the word to be predicted according to the attention distribution;
determining word probability distribution of the word to be predicted according to the influence distribution, the word distribution, the first weight and the second weight;
and generating the word to be predicted according to the word probability distribution.
10. The text translation device of claim 9, wherein the generation module is configured to:
word segmentation processing is carried out on the text to be translated, so that words contained in the text to be translated are obtained;
determining each first word in each word contained in the text to be translated, and generating a word vector of each first word according to a word vector space corresponding to the source language;
And determining each second word in each word contained in the text to be translated, and generating a word vector of each second word according to a word vector space corresponding to the target language.
11. An electronic device comprising a processor and a memory, the processor and the memory being interconnected;
the memory is used for storing a computer program;
the processor is configured to perform the method of any of claims 1 to 8 when the computer program is invoked.
12. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which is executed by a processor to implement the method of any one of claims 1 to 8.
CN202010873804.8A 2020-08-26 2020-08-26 Text translation method, device, electronic equipment and storage medium Active CN111931517B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010873804.8A CN111931517B (en) 2020-08-26 2020-08-26 Text translation method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111931517A CN111931517A (en) 2020-11-13
CN111931517B true CN111931517B (en) 2023-12-12

Family

ID=73305870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010873804.8A Active CN111931517B (en) 2020-08-26 2020-08-26 Text translation method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111931517B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112395892B * 2020-12-03 2022-03-18 Inner Mongolia University of Technology Mongolian-Chinese machine translation method realizing placeholder disambiguation based on a pointer-generator network
CN112528679B * 2020-12-17 2024-02-13 iFLYTEK Co., Ltd. Method and device for training an intention understanding model, and method and device for intention understanding
CN112987664B * 2021-02-09 2022-03-01 Northeastern University Flow shop scheduling method based on deep reinforcement learning
CN112800785B * 2021-04-13 2021-07-27 Institute of Automation, Chinese Academy of Sciences Multimodal machine translation method, device, electronic equipment and storage medium
CN113241074A * 2021-04-28 2021-08-10 Ping An Technology (Shenzhen) Co., Ltd. Training method, device and equipment for a multilingual translation model, and readable storage medium
CN113158695A * 2021-05-06 2021-07-23 Shanghai Jilian Network Technology Co., Ltd. Semantic auditing method and system for multilingual mixed text
CN113343716B * 2021-05-20 2022-09-30 Beijing Sankuai Online Technology Co., Ltd. Multilingual translation method, device, storage medium and equipment
CN114282552B * 2021-11-16 2022-11-04 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method and device for a non-autoregressive translation model
CN115114939B * 2022-04-28 2024-03-22 Tencent Technology (Shenzhen) Co., Ltd. Translation model training method, sentence translation method, device, equipment and program
CN117808015A * 2023-03-21 2024-04-02 Huawei Technologies Co., Ltd. Translation method, electronic device, and computer storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106663092A * 2014-10-24 2017-05-10 Google Inc. Neural machine translation systems with rare word processing
CN106649288A * 2016-12-12 2017-05-10 Beijing Baidu Netcom Science and Technology Co., Ltd. Translation method and device based on artificial intelligence
CN108874785A * 2018-06-01 2018-11-23 Tsinghua University Translation processing method and system
CN109271646A * 2018-09-04 2019-01-25 Tencent Technology (Shenzhen) Co., Ltd. Text translation method and apparatus, readable storage medium, and computer device
CN109543199A * 2018-11-28 2019-03-29 Tencent Technology (Shenzhen) Co., Ltd. Text translation method and related apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant