CN111553174A - Sentence translation method and device based on artificial intelligence - Google Patents

Sentence translation method and device based on artificial intelligence

Info

Publication number
CN111553174A
CN111553174A
Authority
CN
China
Prior art keywords
sentence
translation
feature vector
target
target sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010256030.4A
Other languages
Chinese (zh)
Inventor
王星
焦文祥
涂兆鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010256030.4A priority Critical patent/CN111553174A/en
Publication of CN111553174A publication Critical patent/CN111553174A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a sentence translation method and device based on artificial intelligence. A source sentence to be translated is input into a translation model to obtain a corresponding target sentence; a first feature vector corresponding to the source sentence and a second feature vector corresponding to the target sentence are acquired based on the translation model; an alignment degree between the source sentence and the target sentence is calculated based on the first feature vector and the second feature vector; and whether the target sentence is used as the translation result is determined based on the alignment degree, so that the obtained translation result is more accurate.

Description

Sentence translation method and device based on artificial intelligence
Technical Field
The present application relates to the field of computer and communication technologies, and in particular, to a sentence translation method and apparatus based on artificial intelligence.
Background
With the continuous development of artificial intelligence technology in various fields, machine translation functions in artificial intelligence have been used more and more widely.
When a sentence is translated by existing translation methods, the most basic bilingual word pairs are usually memorized first, and words in the bilingual word pairs are then substituted during training. Because this learning process relies only on existing experiential knowledge and translates by analogy, the translation result is often inaccurate.
Disclosure of Invention
The application aims to provide a sentence translation method and device based on artificial intelligence, which can at least improve translation accuracy.
According to an aspect of an embodiment of the present application, there is provided a sentence translation method based on artificial intelligence, including: inputting a source sentence to be translated into a translation model to obtain a target sentence corresponding to the translated source sentence; acquiring a first feature vector corresponding to the source sentence based on the translation model, and acquiring a second feature vector corresponding to the target sentence; calculating an alignment degree between the source sentence and the target sentence based on the first feature vector and the second feature vector; and determining whether to take the target sentence as a translation result based on the alignment degree.
According to an aspect of an embodiment of the present application, there is provided an artificial-intelligence-based sentence translation apparatus, including: a translation module, configured to input a source sentence to be translated into a translation model to obtain a target sentence corresponding to the translated source sentence; an obtaining module, configured to obtain a first feature vector corresponding to the source sentence based on the translation model, and obtain a second feature vector corresponding to the target sentence; a calculation module, configured to calculate an alignment degree between the source sentence and the target sentence based on the first feature vector and the second feature vector; and a determining module, configured to determine whether to take the target sentence as a translation result based on the alignment degree.
In some embodiments of the present application, based on the foregoing scheme, the translation model is a sequence-to-sequence model, and the obtaining module is configured to: obtaining a word level hidden state of an encoder output end of the sequence-to-sequence model after the source sentence is input into the sequence-to-sequence model; and pooling the word level hidden state at the output end of the encoder to obtain the first feature vector.
In some embodiments of the present application, based on the foregoing solution, the obtaining module is configured to: obtaining a word level hidden state output by a first layer network of a decoder of the sequence-to-sequence model after the source sentence is input into the sequence-to-sequence model; and pooling the word level hidden state output by the first-layer network of the decoder to obtain the second feature vector.
In some embodiments of the present application, based on the foregoing, the calculation module is configured to: respectively converting the first feature vector and the second feature vector into the same vector space to obtain a third feature vector and a fourth feature vector after conversion; and calculating cosine similarity between the third feature vector and the fourth feature vector to obtain the alignment degree between the source sentence and the target sentence.
In some embodiments of the present application, based on the foregoing solution, the artificial intelligence based sentence translating apparatus further includes: the obtaining module is used for obtaining the translation sufficiency of the target sentence; the determination module is configured to: determining whether to take the target sentence as a translation result based on the alignment degree and the translation sufficiency.
In some embodiments of the present application, based on the foregoing, the determining module is configured to: reversely translating the target sentence to obtain a reversely translated result; calculating a reverse translation sufficiency of the target sentence based on the reverse translation result and the source sentence; determining the translation sufficiency based on a reverse translation sufficiency of the target sentence.
In some embodiments of the present application, based on the foregoing, the determining module is configured to: calculating a bilingual inter-translation quality score between the reverse translation result and the source sentence; and taking the bilingual inter-translation quality score as the reverse translation sufficiency of the target sentence.
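The "bilingual inter-translation quality score" above is typically a BLEU-style metric. As an illustration only (not the patent's exact formula), the sketch below computes a simplified sentence-level BLEU between the source sentence and a hypothetical reverse-translation result; the function names, the add-one smoothing, and the example sentences are all assumptions:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, with counts."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped n-gram
    precisions, multiplied by a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        ref, hyp = ngrams(reference, n), ngrams(hypothesis, n)
        overlap = sum((ref & hyp).values())
        total = max(sum(hyp.values()), 1)
        # add-one smoothing so one missing n-gram order does not zero the score
        precisions.append((overlap + 1) / (total + 1))
    bp = min(1.0, math.exp(1 - len(reference) / max(len(hypothesis), 1)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# Reverse translation sufficiency: score the reverse translation result
# against the original source sentence.
source = "the cat sat on the mat".split()
reverse_translation = "the cat sat on the mat".split()
print(round(bleu(source, reverse_translation), 3))  # → 1.0
```

A high score indicates that the reverse translation recovers the source sentence well, i.e. the target sentence is a sufficient translation.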
In some embodiments of the present application, based on the foregoing, the determining module is configured to: acquire a standard translation sentence of the source sentence; obtain a bag-of-vectors sentence similarity between the standard translation sentence and the target sentence based on the standard translation sentence and the target sentence; and take the bag-of-vectors sentence similarity as the translation sufficiency.
In some embodiments of the present application, based on the foregoing, the determining module is configured to: acquire a fifth feature vector representing the standard translation sentence; and calculate a cosine similarity between the fifth feature vector and the second feature vector to obtain the bag-of-vectors sentence similarity between the standard translation sentence and the target sentence.
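As a hedged illustration of this sentence similarity, the sketch below builds the fifth feature vector by averaging word embeddings of a standard (reference) translation and compares it with a target-side feature vector by cosine similarity; the embedding table `EMB`, the example words, and all function names are hypothetical and not part of the patent:

```python
import numpy as np

# Hypothetical word-embedding table; in practice these come from the model.
EMB = {"good": [1.0, 0.0], "morning": [0.0, 1.0], "day": [0.2, 0.9]}

def bag_of_vectors(sentence):
    """Fifth feature vector: average of the word embeddings of the
    standard (reference) translation sentence."""
    return np.mean([EMB[w] for w in sentence.split()], axis=0)

def sentence_similarity(standard, target_vec):
    """Cosine similarity between the fifth feature vector and a
    target-side feature vector."""
    fifth = bag_of_vectors(standard)
    v = np.asarray(target_vec, dtype=float)
    return float(fifth @ v / (np.linalg.norm(fifth) * np.linalg.norm(v)))

print(round(sentence_similarity("good day", [0.2, 0.9]), 3))
```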
According to an aspect of embodiments of the present application, there is provided a computer-readable program medium storing computer program instructions which, when executed by a computer, cause the computer to perform the method of any one of the above.
According to an aspect of an embodiment of the present application, there is provided an electronic apparatus including: a processor; a memory having computer readable instructions stored thereon which, when executed by the processor, implement the method of any of the above.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
in the technical solutions provided in some embodiments of the present application, a source sentence to be translated is input into a translation model to obtain a corresponding target sentence, which serves as a candidate translation result. A first feature vector corresponding to the source sentence and a second feature vector corresponding to the target sentence are acquired based on the translation model; because the translation model itself extracts the features of the source and target sentences, these vectors reflect the features of each sentence well. The alignment degree between the source sentence and the target sentence, calculated based on the first feature vector and the second feature vector, reflects the translation quality of the target sentence; the translation quality is therefore determined based on the alignment degree, and whether the target sentence is used as the translation result is decided accordingly, so that the obtained translation result is more accurate.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
FIG. 1 shows a schematic diagram of an exemplary system architecture to which aspects of embodiments of the present application may be applied;
FIG. 2 schematically illustrates a flow diagram of an artificial intelligence based sentence translation method according to one embodiment of the present application;
FIG. 3 schematically illustrates a flow diagram of an artificial intelligence based sentence translation method according to one embodiment of the present application;
FIG. 4 schematically illustrates a process of obtaining the sufficiency of translation of a target sentence according to one embodiment of the present application;
FIG. 5 schematically illustrates a process of obtaining the sufficiency of translation of a target sentence according to one embodiment of the present application;
FIG. 6 is a graphical illustration of alignment and translation sufficiency according to an embodiment of the present application;
FIG. 7 schematically illustrates a block diagram of an artificial intelligence based sentence translation apparatus according to one embodiment of the present application;
FIG. 8 is a hardware diagram illustrating an electronic device according to an example embodiment.
Detailed Description
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning, and decision making.
Artificial intelligence is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies theories and methods that enable effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics; research in this field involves natural language, the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering, knowledge graphs, and the like.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The scheme provided by the embodiments of the present application relates to machine translation with artificial intelligence. The inventors found that the embeddings in machine translation implicitly encode syntactic and semantic features of the source-end and target-end sentences, and that improving the alignment degree between bilingual sentences can improve machine translation quality.
Example embodiments will now be described more fully hereinafter with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the subject matter of the present application can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so forth. In other instances, well-known methods, devices, implementations, or operations have not been shown or described in detail to avoid obscuring aspects of the application.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
Fig. 1 shows a schematic diagram of an exemplary system architecture 100 to which the technical solutions of the embodiments of the present application can be applied.
As shown in fig. 1, the system architecture 100 may include a terminal device 101 (the terminal device may be one or more of a smartphone, a tablet, a laptop, a desktop computer), a network 102, and a server 103. The network 102 is used to provide a medium for communication links between the terminal devices 101 and the server 103. Network 102 may include various connection types, such as wired communication links, wireless communication links, and so forth.
It should be understood that the number of terminal devices 101, networks 102, and servers 103 in fig. 1 is merely illustrative. There may be any number of terminal devices 101, networks 102, and servers, as desired for implementation. For example, the server 103 may be a server cluster composed of a plurality of servers.
In an embodiment of the present application, the server 103 inputs a source sentence to be translated into the translation model to obtain a corresponding target sentence, which serves as a candidate translation result. A first feature vector corresponding to the source sentence and a second feature vector corresponding to the target sentence are acquired based on the translation model; because the translation model itself extracts the features of the source and target sentences, these vectors reflect the features of each sentence well. The alignment degree between the source sentence and the target sentence, calculated based on the first feature vector and the second feature vector, reflects the translation quality of the target sentence; the translation quality is therefore determined based on the alignment degree, and whether the target sentence is used as the translation result is decided accordingly, so that the obtained translation result is more accurate.
It should be noted that the artificial intelligence based sentence translation method provided in the embodiment of the present application is generally executed by the server 103, and accordingly, the artificial intelligence based sentence translation apparatus is generally disposed in the server 103. However, in other embodiments of the present application, the terminal device 101 may also have a similar function as the server 103, so as to execute the artificial intelligence based sentence translation method provided in the embodiments of the present application.
The implementation details of the technical solution of the embodiment of the present application are set forth in detail below:
FIG. 2 schematically illustrates a flow diagram of an artificial intelligence based sentence translation method, the execution subject of which may be a server, such as server 103 shown in FIG. 1, according to one embodiment of the present application.
Referring to fig. 2, the artificial intelligence based sentence translation method at least includes steps S210 to S240, which are described in detail as follows:
in step S210, a source sentence to be translated is input into the translation model to obtain a target sentence corresponding to the translated source sentence.
In one embodiment of the present application, the translation model may be a statistical machine translation (SMT) model, a long short-term memory (LSTM) model, or a sequence-to-sequence (Seq2Seq) model.
In one embodiment of the present application, the translation model may be a sequence-to-sequence model such as a Transformer model, an RNNsearch model, or a C2C-Seq2Seq model.
In an embodiment of the present application, after a source sentence to be translated is input into a translation model, an output result of the translation model may be used as a target sentence corresponding to the translated source sentence.
In step S220, a first feature vector corresponding to the source sentence is obtained based on the translation model, and a second feature vector corresponding to the target sentence is obtained.
In one embodiment of the present application, the first feature vector may be a vector obtained from a translation model, and may represent features of a source sentence, where the features of the source sentence include word features and word-to-word relationship features included in the source sentence; the second feature vector may be a vector obtained from the translation model, and may represent features of the target sentence, where the features of the target sentence include word features and word-to-word relationship features included in the target sentence.
In an embodiment of the present application, a sequence-to-sequence model may be used to translate the source sentence. The sequence-to-sequence model includes a coding layer and a decoding layer: the source sentence is encoded by the coding layer and then decoded by the decoding layer to obtain the target sentence. The first feature vector corresponding to the source sentence can be obtained from the coding layer of the sequence-to-sequence model, and the second feature vector corresponding to the target sentence can be obtained from the decoding layer of the sequence-to-sequence model.
In this embodiment, in the process of processing the source sentence by the encoding layer, the word features and the relationship features between words in the source sentence can be comprehensively considered, so that the first feature vector corresponding to the source sentence acquired from the encoding layer can carry the word features and the relationship features between words in the source sentence; in the process of obtaining the target sentence, the decoding layer can comprehensively consider the word characteristics and the relation characteristics between the words in the target sentence, so that the second characteristic vector corresponding to the target sentence obtained from the decoding layer can carry the word characteristics and the relation characteristics between the words in the target sentence.
In one embodiment of the present application, when translating a source sentence using a sequence-to-sequence model, a word-level hidden state at an encoder output of the sequence-to-sequence model after inputting the source sentence into the sequence-to-sequence model may be obtained; and pooling the word level hidden state at the output end of the encoder to obtain a first feature vector.
In this embodiment, the word-level hidden state from the sequence to the encoder output end of the sequence model can carry word features and relationship features between words in the source sentence due to the encoding process of the translation model, and can be used as the first feature vector corresponding to the source sentence.
In an embodiment of the present application, the word-level hidden states at the output end of the encoder may be average-pooled to obtain the first feature vector.
In one embodiment of the present application, when translating a source sentence using a sequence-to-sequence model, a word-level hidden state output by a first-layer network of a decoder of the sequence-to-sequence model after the source sentence is input into the sequence-to-sequence model may be obtained; and performing pooling treatment on the word level hidden state output by the first-layer network of the decoder to obtain a second feature vector.
In this embodiment, the word-level hidden state output by the first layer network of the decoder from the sequence to the sequence model can carry word features and relationship features between words in the target sentence as a result of the decoding process of the translation model, and can be used as the second feature vector corresponding to the target sentence.
In an embodiment of the present application, the word-level hidden states output by the first-layer network of the decoder may be average-pooled to obtain the second feature vector.
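A minimal sketch of this pooling step, assuming the word-level hidden states are already available as a `[seq_len, d_model]` array; the example values and the function name are illustrative only:

```python
import numpy as np

def mean_pool(hidden_states):
    """Average-pool a sequence of word-level hidden states
    (shape: [seq_len, d_model]) into one sentence-level feature vector."""
    h = np.asarray(hidden_states, dtype=float)
    return h.mean(axis=0)

# Hypothetical encoder output for a 3-word source sentence, d_model = 4.
encoder_states = [[1.0, 0.0, 2.0, 0.0],
                  [3.0, 2.0, 0.0, 4.0],
                  [2.0, 4.0, 1.0, 2.0]]
first_feature_vector = mean_pool(encoder_states)  # from the encoder output end
# The second feature vector is obtained the same way, from the word-level
# hidden states output by the first-layer network of the decoder.
print(first_feature_vector)  # → [2. 2. 1. 2.]
```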
With continued reference to fig. 2, in step S230, based on the first feature vector and the second feature vector, a degree of alignment between the source sentence and the target sentence is calculated.
In one embodiment of the present application, cosine similarity between the first feature vector and the second feature vector may be calculated, and the degree of alignment between the source sentence and the target sentence may be determined based on the cosine similarity.
In one embodiment of the present application, a cosine similarity between the first feature vector and the second feature vector may be used as an alignment degree between the source sentence and the target sentence.
In an embodiment of the present application, a correspondence between cosine similarity and alignment degree may be set, and the alignment degree between the source sentence and the target sentence is determined by finding the alignment degree corresponding to the cosine similarity between the first feature vector and the second feature vector.
In an embodiment of the present application, a correspondence between cosine similarity and alignment degree may be stored in a cosine similarity and alignment degree comparison table, and the alignment degree between the source sentence and the target sentence may be found by finding the cosine similarity and alignment degree comparison table.
In an embodiment of the application, a first threshold may be set for the cosine similarity. If the cosine similarity between the first feature vector and the second feature vector does not reach the first threshold, the alignment degree between the source sentence and the target sentence is not calculated and the target sentence is discarded. Screening target sentences by cosine similarity in this way saves computing resources.
In this embodiment, if the cosine similarity between the first feature vector and the second feature vector reaches the first threshold, the alignment degree between the source sentence and the target sentence is determined according to the cosine similarity between the first feature vector and the second feature vector.
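The cosine-similarity screening described above can be sketched as follows; the value of `FIRST_THRESHOLD` is an illustrative assumption, since the patent does not fix a concrete number:

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

FIRST_THRESHOLD = 0.5  # illustrative value only

def alignment_degree(first_vec, second_vec):
    """Return the alignment degree, or None to signal that the target
    sentence should be discarded before any further computation."""
    sim = cosine_similarity(first_vec, second_vec)
    if sim < FIRST_THRESHOLD:
        return None   # screen out the candidate, saving later computation
    return sim        # here the similarity itself serves as the alignment degree

print(alignment_degree([1.0, 0.0], [1.0, 0.0]))  # identical directions → 1.0
print(alignment_degree([1.0, 0.0], [0.0, 1.0]))  # orthogonal → None
```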
In one embodiment of the present application, the degree of alignment between the source sentence and the target sentence may be calculated using canonical correlation analysis.
In an embodiment of the present application, the first feature vector and the second feature vector may be converted into the same vector space, respectively, to obtain a third feature vector and a fourth feature vector after conversion; and calculating cosine similarity between the third feature vector and the fourth feature vector to obtain the alignment degree between the source sentence and the target sentence.
In this embodiment, since the third feature vector and the fourth feature vector are in the same vector space, the cosine similarity between the third feature vector and the fourth feature vector can more accurately reflect the alignment degree between the source sentence and the target sentence than the cosine similarity between the first feature vector and the second feature vector.
In one embodiment of the present application, the cosine similarity between the third feature vector and the fourth feature vector may be used as the alignment degree between the source sentence and the target sentence.
In an embodiment of the present application, a correspondence between cosine similarity and alignment degree may be set, and the alignment degree between the source sentence and the target sentence is determined by finding the alignment degree corresponding to cosine similarity between the third feature vector and the fourth feature vector.
In an embodiment of the application, a first threshold may be set for the cosine similarity. If the cosine similarity between the third feature vector and the fourth feature vector does not reach the first threshold, the alignment degree between the source sentence and the target sentence is not calculated and the target sentence is discarded. Screening target sentences by cosine similarity in this way saves computing resources.
In this embodiment, if the cosine similarity between the third feature vector and the fourth feature vector reaches the first threshold, the alignment degree between the source sentence and the target sentence is determined according to the cosine similarity between the third feature vector and the fourth feature vector.
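A minimal sketch of the shared-vector-space comparison, assuming two hypothetical learned projection matrices (obtained, for example, by canonical correlation analysis). The random matrices below are stand-ins for illustration only, not trained projections:

```python
import numpy as np

# Hypothetical learned projection matrices mapping encoder-side and
# decoder-side feature vectors into one shared 3-dimensional space.
rng = np.random.default_rng(0)
W_src = rng.standard_normal((3, 4))  # 4-dim source vector -> shared space
W_tgt = rng.standard_normal((3, 5))  # 5-dim target vector -> shared space

def to_shared_space(vec, W):
    """Project a feature vector into the shared vector space."""
    return W @ np.asarray(vec, dtype=float)

def shared_space_alignment(first_vec, second_vec):
    """Cosine similarity between the third and fourth feature vectors,
    i.e. the projections of the first and second feature vectors."""
    third = to_shared_space(first_vec, W_src)
    fourth = to_shared_space(second_vec, W_tgt)
    return float(third @ fourth /
                 (np.linalg.norm(third) * np.linalg.norm(fourth)))
```

Because both projections land in the same space, the cosine between them is a meaningful comparison even though the original vectors came from different sides of the model.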
In step S240, it is determined whether to take the target sentence as a translation result based on the degree of alignment.
In an embodiment of the present application, a second threshold may be set for the alignment degree, and if the alignment degree does not reach the second threshold, the target sentence is discarded, and the target sentence is not used as the translation result.
In an embodiment of the present application, if the alignment degree reaches the second threshold, the target sentence may be used as the translation result.
In an embodiment of the present application, if the alignment degree does not reach the second threshold, a new target sentence may be retrieved.
In an embodiment of the present application, if the alignment degree does not reach the second threshold, the translation model may be retrained and steps S210 to S240 repeated until the alignment degree between the source sentence and the target sentence reaches the second threshold; at that point the target sentence is used as the translation result and retraining of the translation model stops.
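The accept-or-retrain loop of steps S210 to S240 can be sketched as follows; `DummyModel`, its method names, and the threshold value are all illustrative stand-ins for the real translation model, not part of the patent:

```python
SECOND_THRESHOLD = 0.8  # illustrative value; the patent does not fix a number

class DummyModel:
    """Stand-in for the translation model: its alignment score improves
    after each (hypothetical) retraining round."""
    def __init__(self):
        self.score = 0.6
    def translate(self, source):
        return f"translation of: {source}"
    def source_features(self, sentence):
        return sentence
    def target_features(self, sentence):
        return sentence
    def alignment(self, first, second):
        return self.score
    def retrain(self):
        self.score += 0.15

def translate_with_check(source, model, max_rounds=5):
    """Steps S210-S240: translate, score the alignment, and accept the
    target sentence only once the alignment degree reaches the second
    threshold; otherwise retrain the model and try again."""
    for _ in range(max_rounds):
        target = model.translate(source)            # S210
        first = model.source_features(source)       # S220
        second = model.target_features(target)
        degree = model.alignment(first, second)     # S230
        if degree >= SECOND_THRESHOLD:              # S240: accept
            return target
        model.retrain()                             # below threshold: retrain
    return None  # no acceptable translation found within max_rounds
```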
In the embodiment shown in fig. 2, a source sentence to be translated is input into the translation model to obtain a corresponding target sentence, which serves as a candidate translation result. The first feature vector corresponding to the source sentence and the second feature vector corresponding to the target sentence are obtained from the translation model itself; because the translation model has already extracted the features of both sentences, these vectors reflect the features of the source sentence and the target sentence well. The alignment degree between the source sentence and the target sentence is then calculated from the first feature vector and the second feature vector. Since this alignment degree reflects the translation quality of the target sentence, it can be used to decide whether the target sentence should be taken as the translation result, making the obtained translation result more accurate.
FIG. 3 schematically illustrates a flow diagram of an artificial intelligence based sentence translation method, the execution subject of which may be a server, such as server 103 shown in FIG. 1, according to one embodiment of the present application.
Referring to fig. 3, the artificial intelligence based sentence translation method may include steps S310 to S340, which are described in detail as follows:
in step S310, a source sentence to be translated is input into the translation model to obtain a target sentence corresponding to the translated source sentence.
In step S320, based on the source sentence and the target sentence, a degree of alignment between the source sentence and the target sentence is calculated.
In step S330, the translation sufficiency of the target sentence is acquired.
In an embodiment of the present application, the process of obtaining the sufficiency of translation of the target sentence in S330 may include steps S410 to S430 shown in fig. 4, where fig. 4 schematically shows the process of obtaining the sufficiency of translation of the target sentence according to an embodiment of the present application, and the following is specifically described:
in step S410, the target sentence is reverse-translated to obtain a reverse translation result.
In one embodiment of the present application, a third-party translator may be used to reverse-translate a target sentence to obtain a reverse-translated result, wherein the degree of alignment between the input and output results of the third-party translator reaches a second threshold.
In step S420, based on the reverse translation result and the source sentence, the reverse translation sufficiency of the target sentence is calculated.
In one embodiment of the present application, a bilingual inter-translation quality score (a BLEU score) between the reverse translation result and the source sentence may be calculated, and this score is taken as the reverse translation sufficiency of the target sentence.
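The bilingual inter-translation quality score mentioned above can be illustrated with a simplified sentence-level BLEU. This is a sketch, not the exact metric the application uses: the add-one smoothing is an assumption of this example (production toolkits such as sacrebleu use more careful smoothing), and tokenization is naive whitespace splitting.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU for one reference and one hypothesis.

    Computes modified n-gram precisions with add-one smoothing, their
    geometric mean, and a brevity penalty for short hypotheses.
    """
    precisions = []
    for n in range(1, max_n + 1):
        ref_counts = Counter(ngrams(reference, n))
        hyp_counts = Counter(ngrams(hypothesis, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append((overlap + 1) / (total + 1))  # add-one smoothing
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    brevity = min(1.0, math.exp(1 - len(reference) / max(len(hypothesis), 1)))
    return brevity * geo_mean
```

A reverse translation identical to the source scores 1.0; a near-match scores high but below 1.0, which is the behavior the reverse translation sufficiency relies on.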
In an embodiment of the application, a reverse feature vector corresponding to a reverse translation result may be obtained, a similarity between the reverse feature vector and a first feature vector corresponding to a source sentence is calculated, and the similarity between the reverse feature vector and the first feature vector is taken as a reverse translation sufficiency of a target sentence.
In one embodiment of the present application, the similarity between the reverse feature vector and the first feature vector may be determined by calculating the Jaccard similarity coefficient, cosine similarity, Minkowski distance, Mahalanobis distance, Hamming distance, relative entropy, or the like between the two vectors.
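Several of the measures listed above can be sketched directly. Note that the distance-type measures (Minkowski, Hamming, relative entropy) decrease as vectors become more similar, so in practice they would be inverted or negated to serve as similarities; that conversion is left out of this sketch.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity: 1.0 for parallel vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def minkowski(u, v, p=2):
    """Minkowski distance of order p (p=2 gives Euclidean distance)."""
    return float(np.sum(np.abs(u - v) ** p) ** (1.0 / p))

def hamming(u, v):
    """Number of positions where the two vectors differ."""
    return int(np.sum(u != v))

def relative_entropy(p, q):
    """KL divergence D(p || q) for probability vectors (assumes q > 0
    wherever p > 0)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))
```

The Mahalanobis distance is omitted here because it additionally requires a covariance matrix estimated from a sample of feature vectors.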
In step S430, based on the reverse translation sufficiency of the target sentence, the translation sufficiency is determined.
In one embodiment of the present application, the reverse translation sufficiency of the target sentence may be taken as the translation sufficiency.
In an embodiment of the present application, a correspondence between reverse translation sufficiency and translation sufficiency may be set, and the translation sufficiency is obtained by looking up the value corresponding to the reverse translation sufficiency of the target sentence.
In the embodiment shown in fig. 4, a new indicator for determining the translation sufficiency of a target sentence is proposed: adequacy of reverse translation (ABBT). The translation sufficiency of a target sentence is determined by calculating its reverse translation sufficiency; the higher the reverse translation sufficiency, the higher the translation sufficiency. If the target sentence is a sufficient translation, then the reverse translation result obtained by translating it back will be highly similar to the source sentence, and the bilingual inter-translation quality score between the reverse translation result and the source sentence will also be high, so the obtained reverse translation sufficiency can be used to determine the translation sufficiency of the target sentence.
In an embodiment of the present application, the process of acquiring the sufficiency of translation of the target sentence in step S330 may include steps S510 to S530 as shown in fig. 5, where fig. 5 schematically illustrates the process of acquiring the sufficiency of translation of the target sentence according to an embodiment of the present application, and the following is specifically described:
in step S510, a standard translated sentence of the source sentence is obtained.
In one embodiment of the present application, the standard translated sentence is obtained by passing the source sentence through a third-party translator, and is the most accurate available translation result for the source sentence.
In an embodiment of the application, a source sentence can be translated by a plurality of third-party translators to obtain a plurality of third-party translation results; by comparing the alignment degree and the reverse translation sufficiency of these results, the result with the best alignment degree and the highest reverse translation sufficiency is selected as the standard translation result.
In step S520, based on the standard translated sentence and the target sentence, a vector packet sentence similarity between the standard translated sentence and the target sentence is obtained.
In an embodiment of the present application, a fifth feature vector for representing a standard translated sentence may be obtained, and cosine similarity between the fifth feature vector and the second feature vector is calculated to obtain a vector packet sentence similarity between the standard translated sentence and the target sentence.
In one embodiment of the present application, a fifth feature vector representing a standard translated sentence may be obtained from a third party translator.
In an embodiment of the present application, if the third-party translator is a sequence-to-sequence model, the fifth feature vector of the standard translated sentence may be obtained from a coding layer of the sequence-to-sequence model.
In an embodiment of the present application, the fifth feature vector used for representing the standard translated sentence may be a word frequency vector of the standard translated sentence, or may be calculated from the standard translated sentence, and the fifth feature vector is not limited herein.
In step S530, the vector package sentence similarity is taken as the translation sufficiency.
In the embodiment of fig. 5, a standard translated sentence of the source sentence is obtained. If the target sentence is translated sufficiently, its similarity to the standard translated sentence should be high; therefore, calculating the vector package sentence similarity between the target sentence and the standard translated sentence can determine the translation sufficiency of the target sentence: the higher the vector package sentence similarity, the higher the translation sufficiency.
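One simple reading of the vector package sentence similarity is a "bag of vectors": mean-pool the word vectors of each sentence and take the cosine similarity of the pooled vectors. This is a sketch under that assumption; the application does not fix the exact pooling, and the fifth feature vector could equally be a word frequency vector as noted above.

```python
import numpy as np

def sentence_vector(word_vectors):
    """Bag-of-vectors sentence representation: mean of the word vectors."""
    return np.mean(np.stack(word_vectors), axis=0)

def vector_bag_similarity(std_word_vecs, tgt_word_vecs):
    """Cosine similarity between the mean-pooled sentence vectors of the
    standard translated sentence and the target sentence."""
    a = sentence_vector(std_word_vecs)
    b = sentence_vector(tgt_word_vecs)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```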
In one embodiment of the present application, the translation sufficiency of the target sentence may be determined jointly based on the reverse translation sufficiency and the vector package sentence similarity between the standard translated sentence and the target sentence.
In an embodiment of the present application, weights may be set for the reverse translation sufficiency and the vector packet sentence similarity, respectively, and a weighted sum of the reverse translation sufficiency and the vector packet sentence similarity is calculated to obtain the translation sufficiency of the target sentence.
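The weighted combination just described can be sketched in one line. The equal default weights are illustrative only; the embodiment states merely that weights may be set for each component.

```python
def translation_sufficiency(reverse_sufficiency, bag_similarity,
                            w_reverse=0.5, w_bag=0.5):
    """Translation sufficiency as a weighted sum of the reverse translation
    sufficiency and the vector package sentence similarity."""
    return w_reverse * reverse_sufficiency + w_bag * bag_similarity
```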
With continued reference to fig. 3, in step S340, it is determined whether to take the target sentence as a translation result based on the degree of alignment and the translation sufficiency.
In an embodiment of the present application, a third threshold may be set for the sufficiency of translation, and if the alignment degree between the source sentence and the target sentence reaches the second threshold and the sufficiency of translation of the target sentence reaches the third threshold, the target sentence is used as the translation result.
In one embodiment of the present application, weights may be set for the alignment degree and the translation sufficiency, respectively, and whether to take the target sentence as the translation result may be determined based on a weighted sum of the alignment degree and the translation sufficiency.
In one embodiment of the present application, a fourth threshold may be set for the weighted sum of the alignment degree and the translation sufficiency, and if the weighted sum of the alignment degree and the translation sufficiency reaches the fourth threshold, the target sentence is taken as the translation result.
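The final acceptance test combining alignment degree and translation sufficiency can be sketched as below. The weights and the fourth threshold are hypothetical values chosen for illustration; the application leaves them unspecified.

```python
def accept_translation(alignment, sufficiency,
                       w_align=0.5, w_suff=0.5, fourth_threshold=0.7):
    """Accept the target sentence as the translation result when the
    weighted sum of alignment degree and translation sufficiency reaches
    the fourth threshold."""
    return w_align * alignment + w_suff * sufficiency >= fourth_threshold
```

If this returns `False`, the embodiments above re-obtain the target sentence or retrain the translation model and repeat the procedure.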
In one embodiment of the present application, if it is determined, by calculating the alignment degree between the source sentence and the target sentence and the translation sufficiency of the target sentence, that the target sentence cannot be used as the translation result, a new target sentence may be obtained.
In an embodiment of the present application, if it is determined that the target sentence cannot be used as the translation result by calculating the alignment degree between the source sentence and the target sentence and the translation sufficiency of the target sentence, the translation model may be retrained, and steps S310 to S340 may be re-executed until the target sentence can be used as the translation result, and the training of the translation model is stopped.
In the embodiment of fig. 3, by determining whether to take the target sentence as the translation result in consideration of the alignment degree between the source sentence and the target sentence and the translation sufficiency of the target sentence, the obtained translation result is more accurate.
In an embodiment of the present application, the artificial intelligence based sentence translation method may be used for machine translation: a first sentence to be translated is input into a machine translation system to obtain a translated second sentence. When the method is applied to measure the translation quality of the second sentence, the relationship between alignment degree and translation quality can be further verified. A second sentence with higher translation sufficiency is translated more fully and therefore has better translation quality; expressing translation quality in terms of translation sufficiency, the relationship between alignment degree and translation sufficiency is obtained, which verifies the relationship between alignment degree and translation quality.
In one embodiment of the present application, three language pairs may be obtained using a Transformer model: English first sentence with German second sentence, English first sentence with French second sentence, and Chinese first sentence with English second sentence. The artificial intelligence based sentence translation method of the present application calculates the alignment degree between the first sentence and the second sentence, and the relationship between alignment degree and translation sufficiency is obtained for each of the three language pairs.
In this embodiment, the step of calculating the alignment degree between the first sentence and the second sentence by the artificial intelligence based sentence translation method of the present application is as follows: after the feature vector of the first sentence and the feature vector of the second sentence are obtained from the Transformer model, the two feature vectors are converted into the same vector space to obtain a third feature vector and a fourth feature vector, and the cosine similarity between the third feature vector and the fourth feature vector is taken as the alignment degree between the first sentence and the second sentence.
The correlation between the degree of alignment and the sufficiency of translation obtained by this example is shown in Table 1:
(Table 1 is provided as an image in the original publication.)
TABLE 1
As can be seen from table 1, in all three language pairs the correlation between the alignment degree and the reverse translation sufficiency exceeds 0.9, with an average of 0.95; the correlation between the alignment degree and the vector package sentence similarity likewise exceeds 0.9, with an average of 0.9. It can therefore be determined that the alignment degree and the translation sufficiency are highly correlated, and hence that the alignment degree and the translation quality are highly correlated.
In this embodiment, at different stages of training of the Transformer model, the curves of alignment degree and translation sufficiency are as shown in fig. 6, which schematically illustrates alignment degree and translation sufficiency curves according to an embodiment of the present application. As fig. 6 shows, for all three language pairs the alignment degree curve stays close to both the reverse translation sufficiency curve and the vector package sentence similarity curve of the second sentence; as the number of training steps increases, the curves rise together and gradually converge. It can therefore be determined that the alignment degree is highly correlated with the translation sufficiency, and hence with the translation quality.
In an embodiment of the present application, the artificial intelligence based sentence translation method may be used to train a translation model: training ends once the output of the translation model meets the requirements, so a translation model with higher translation quality can be obtained.
The following describes embodiments of an apparatus of the present application, which may be used to perform the artificial intelligence based sentence translation method in the above embodiments of the present application. For details that are not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the artificial intelligence based sentence translation method described above in the present application.
FIG. 7 schematically illustrates a block diagram of an artificial intelligence based sentence translation apparatus according to one embodiment of the present application.
Referring to fig. 7, an artificial intelligence based sentence translation apparatus 700 according to an embodiment of the present application includes a translation module 701, an acquisition module 702, a calculation module 703, and a determination module 704.
In some embodiments of the present application, based on the foregoing scheme, the translation module 701 is configured to input a source sentence to be translated into a translation model to obtain a target sentence corresponding to the translated source sentence; the obtaining module 702 is configured to obtain a first feature vector corresponding to a source sentence based on a translation model, and obtain a second feature vector corresponding to a target sentence; the calculation module 703 is configured to calculate an alignment degree between the source sentence and the target sentence based on the first feature vector and the second feature vector; the determining module 704 is configured to determine whether to use the target sentence as the translation result based on the alignment degree.
In some embodiments of the present application, based on the foregoing scheme, the translation model is a sequence-to-sequence model, and the obtaining module 702 is configured to: obtaining a word level hidden state of an encoder output end of the sequence-to-sequence model after a source sentence is input into the sequence-to-sequence model; and pooling the word level hidden state at the output end of the encoder to obtain a first feature vector.
In some embodiments of the present application, based on the foregoing solution, the obtaining module 702 is configured to: obtaining a word level hidden state output by a first layer network of a decoder of the sequence-to-sequence model after a source sentence is input into the sequence-to-sequence model; and performing pooling treatment on the word level hidden state output by the first-layer network of the decoder to obtain a second feature vector.
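The pooling performed by the obtaining module 702 can be sketched as mean pooling over the word-level hidden states (an assumption of this sketch: the application does not fix the pooling operator, and max pooling would be an equally valid choice).

```python
import numpy as np

def pool_hidden_states(hidden_states):
    """Pool word-level hidden states of shape (seq_len, hidden_dim) into a
    single sentence-level feature vector by averaging over positions."""
    return np.asarray(hidden_states).mean(axis=0)
```

Applied to the encoder output this yields the first feature vector; applied to the first-layer decoder output it yields the second feature vector.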
In some embodiments of the present application, based on the foregoing solution, the calculation module 703 is configured to: respectively converting the first feature vector and the second feature vector into the same vector space to obtain a third feature vector and a fourth feature vector after conversion; and calculating cosine similarity between the third feature vector and the fourth feature vector to obtain the alignment degree between the source sentence and the target sentence.
In some embodiments of the present application, based on the foregoing solution, the artificial intelligence based sentence translating apparatus further includes: the obtaining module is used for obtaining the translation sufficiency of the target sentence; the determination module 704 is configured to: and determining whether to take the target sentence as a translation result or not based on the alignment degree and the translation sufficiency.
In some embodiments of the present application, based on the foregoing, the determining module 704 is configured to: reversely translating the target sentence to obtain a reversely translated result; calculating the reverse translation sufficiency of the target sentence based on the reverse translation result and the source sentence; based on the reverse translation sufficiency of the target sentence, translation sufficiency is determined.
In some embodiments of the present application, based on the foregoing, the determining module 704 is configured to: calculating bilingual inter-translation quality scores between the reverse translation results and the source sentences; and taking the bilingual inter-translation quality score as the reverse translation sufficiency of the target sentence.
In some embodiments of the present application, based on the foregoing, the determining module 704 is configured to: obtaining a standard translation sentence of a source sentence; obtaining a vector packet sentence similarity between the standard translation sentence and the target sentence based on the standard translation sentence and the target sentence; and taking the similarity of the sentences in the vector packet as the translation sufficiency.
In some embodiments of the present application, based on the foregoing, the determining module 704 is configured to: acquiring a fifth feature vector for representing a standard translation sentence; and calculating cosine similarity between the fifth feature vector and the second feature vector to obtain the similarity of the vector packet sentences between the standard translated sentences and the target sentences.
As will be appreciated by one skilled in the art, aspects of the present application may be embodied as a system, method, or program product. Accordingly, various aspects of the present application may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
An electronic device 80 according to this embodiment of the present application is described below with reference to fig. 8. The electronic device 80 shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 8, the electronic device 80 is in the form of a general purpose computing device. The components of the electronic device 80 may include, but are not limited to: the at least one processing unit 81, the at least one memory unit 82, a bus 83 connecting different system components (including the memory unit 82 and the processing unit 81), and a display unit 84.
Wherein the storage unit stores program code that can be executed by the processing unit 81 such that the processing unit 81 performs the steps according to various exemplary embodiments of the present application described in the section "example methods" above in this specification.
The storage unit 82 may include readable media in the form of volatile storage units, such as a random access storage unit (RAM)821 and/or a cache storage unit 822, and may further include a read only storage unit (ROM) 823.
The storage unit 82 may also include a program/utility 824 having a set (at least one) of program modules 825, such program modules 825 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 83 may be any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 80 may also communicate with one or more external devices (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 80, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 80 to communicate with one or more other computing devices. Such communication may be through input/output (I/O) interfaces 85. Also, the electronic device 80 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 86. As shown, the network adapter 86 communicates with the other modules of the electronic device 80 via the bus 83. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 80, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to make a computing device (which can be a personal computer, a server, a terminal device, or a network device, etc.) execute the method according to the embodiments of the present application.
There is also provided, in accordance with an embodiment of the present application, a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, various aspects of the present application may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the present application described in the "exemplary methods" section above of this specification, when the program product is run on the terminal device.
In some embodiments of the present application, a program product for implementing the above method of embodiments of the present application is provided, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Furthermore, the above-described figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the present application, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. A sentence translation method based on artificial intelligence is characterized by comprising the following steps:
inputting a source sentence to be translated into a translation model to obtain a target sentence corresponding to the source sentence after the source sentence is translated;
acquiring a first feature vector corresponding to the source sentence based on the translation model, and acquiring a second feature vector corresponding to the target sentence;
calculating an alignment degree between the source sentence and a target sentence based on the first feature vector and the second feature vector;
determining whether to take the target sentence as a translation result based on the alignment degree.
2. The artificial intelligence based sentence translation method of claim 1, wherein the translation model is a sequence-to-sequence model,
the obtaining of the first feature vector corresponding to the source sentence based on the translation model includes:
obtaining a word level hidden state of an encoder output end of the sequence-to-sequence model after the source sentence is input into the sequence-to-sequence model;
and pooling the word level hidden state at the output end of the encoder to obtain the first feature vector.
3. The sentence translation method based on artificial intelligence of claim 2, wherein the acquiring the second feature vector corresponding to the target sentence comprises:
obtaining word-level hidden states output by the first-layer network of the decoder of the sequence-to-sequence model after the source sentence is input into the sequence-to-sequence model; and
pooling the word-level hidden states output by the first-layer network of the decoder to obtain the second feature vector.
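The pooling step in claims 2 and 3 can be sketched as follows. The claims do not fix the pooling operator, so mean pooling is assumed here; the `mean_pool` function and the toy hidden-state matrix are illustrative only, not the patented implementation.

```python
import numpy as np

def mean_pool(hidden_states):
    """Pool a (seq_len, d_model) matrix of word-level hidden states
    into a single (d_model,) sentence-level feature vector."""
    h = np.asarray(hidden_states, dtype=float)
    return h.mean(axis=0)

# Toy "hidden states" for a 3-token sentence with d_model = 4.
states = [[1.0, 0.0, 2.0, 0.0],
          [3.0, 0.0, 0.0, 2.0],
          [2.0, 0.0, 1.0, 1.0]]
vec = mean_pool(states)
print(vec)  # [2. 0. 1. 1.]
```

The same operator would be applied to the encoder-output states (claim 2) and to the decoder first-layer states (claim 3), yielding the first and second feature vectors respectively.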
4. The sentence translation method based on artificial intelligence of claim 1, wherein the calculating the alignment degree between the source sentence and the target sentence based on the first feature vector and the second feature vector comprises:
converting the first feature vector and the second feature vector into the same vector space to obtain a converted third feature vector and a converted fourth feature vector, respectively; and
calculating the cosine similarity between the third feature vector and the fourth feature vector to obtain the alignment degree between the source sentence and the target sentence.
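A minimal sketch of claim 4's alignment computation. The claim does not specify how the vectors are mapped into a shared space, so a learned linear projection is assumed; `W_src`, `W_tgt`, and the identity matrices used in the demo are hypothetical stand-ins.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def alignment_degree(src_vec, tgt_vec, W_src, W_tgt):
    """Project both feature vectors into a shared space with (hypothetical)
    projections W_src / W_tgt, then score their cosine similarity."""
    third = np.asarray(W_src, dtype=float) @ np.asarray(src_vec, dtype=float)
    fourth = np.asarray(W_tgt, dtype=float) @ np.asarray(tgt_vec, dtype=float)
    return cosine(third, fourth)

# Identity projections for illustration only.
I = np.eye(2)
print(alignment_degree([1.0, 0.0], [1.0, 0.0], I, I))  # 1.0 (perfectly aligned)
print(alignment_degree([1.0, 0.0], [0.0, 1.0], I, I))  # 0.0 (orthogonal)
```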
5. The sentence translation method based on artificial intelligence of claim 1, wherein the method further comprises:
acquiring a translation sufficiency of the target sentence; and
the determining whether to take the target sentence as the translation result based on the alignment degree comprises:
determining whether to take the target sentence as the translation result based on the alignment degree and the translation sufficiency.
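Claim 5 combines the two quality signals into an accept/reject decision but does not fix the combination rule. One plausible sketch is a pair of thresholds, both of which must be cleared; the function name and threshold values below are hypothetical.

```python
def accept_translation(alignment, sufficiency,
                       align_threshold=0.5, suff_threshold=0.5):
    """Hypothetical decision rule for claim 5: keep the target sentence
    only when both the alignment degree and the translation sufficiency
    clear their thresholds."""
    return alignment >= align_threshold and sufficiency >= suff_threshold

print(accept_translation(0.9, 0.8))  # True  -> target sentence kept
print(accept_translation(0.9, 0.2))  # False -> translation rejected
```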
6. The sentence translation method based on artificial intelligence of claim 5, wherein the acquiring the translation sufficiency of the target sentence comprises:
back-translating the target sentence to obtain a back-translation result;
calculating a back-translation sufficiency of the target sentence based on the back-translation result and the source sentence; and
determining the translation sufficiency based on the back-translation sufficiency of the target sentence.
7. The sentence translation method based on artificial intelligence of claim 6, wherein the calculating the back-translation sufficiency of the target sentence based on the back-translation result and the source sentence comprises:
calculating a bilingual inter-translation quality score between the back-translation result and the source sentence; and
taking the bilingual inter-translation quality score as the back-translation sufficiency of the target sentence.
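The back-translation check of claims 6 and 7 can be sketched as follows. The patent names a "bilingual inter-translation quality score" (in practice commonly a BLEU-style metric) without fixing it; the modified unigram precision below is a deliberately simple stand-in, and the example sentences are toy data.

```python
from collections import Counter

def unigram_precision(candidate, reference):
    """Modified unigram precision between the back-translation and the
    source sentence -- a simple stand-in for the bilingual inter-translation
    quality score of claim 7, clipped so repeated words cannot inflate it."""
    cand, ref = candidate.split(), reference.split()
    ref_counts = Counter(ref)
    overlap = sum(min(c, ref_counts[w]) for w, c in Counter(cand).items())
    return overlap / max(len(cand), 1)

source = "the cat sat on the mat"
back_translation = "the cat sat on a mat"  # hypothetical round-trip result
sufficiency = unigram_precision(back_translation, source)
print(round(sufficiency, 3))  # 0.833 -> 5 of 6 tokens recovered
```

A low score suggests the forward translation dropped or distorted source content, so the target sentence would be penalised as insufficient.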
8. The sentence translation method based on artificial intelligence of claim 5, wherein the acquiring the translation sufficiency of the target sentence comprises:
acquiring a standard translation sentence of the source sentence;
obtaining a bag-of-vectors sentence similarity between the standard translation sentence and the target sentence based on the standard translation sentence and the target sentence; and
taking the bag-of-vectors sentence similarity as the translation sufficiency.
9. The sentence translation method based on artificial intelligence of claim 8, wherein the obtaining the bag-of-vectors sentence similarity between the standard translation sentence and the target sentence comprises:
acquiring a fifth feature vector representing the standard translation sentence; and
calculating the cosine similarity between the fifth feature vector and the second feature vector to obtain the bag-of-vectors sentence similarity between the standard translation sentence and the target sentence.
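The bag-of-vectors similarity of claims 8 and 9 can be sketched as averaging word vectors and comparing by cosine similarity. The hand-written embedding table below is purely illustrative; a real system would reuse the model's own representations rather than this lookup.

```python
import numpy as np

# Toy word-embedding table (hypothetical).
EMB = {"good": [1.0, 0.0], "morning": [0.0, 1.0]}

def bag_of_vectors(sentence):
    """Average the word vectors of a sentence into one feature vector
    (the 'fifth feature vector' when applied to the standard translation)."""
    vecs = [EMB[w] for w in sentence.split()]
    return np.mean(np.asarray(vecs, dtype=float), axis=0)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

reference_vec = bag_of_vectors("good morning")  # standard translation
target_vec = bag_of_vectors("good good")        # model output (toy)
print(round(cosine(reference_vec, target_vec), 4))  # 0.7071
```

The resulting similarity is then used directly as the translation sufficiency (claim 8).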
10. A sentence translation apparatus based on artificial intelligence, characterized by comprising:
a translation module, configured to input a source sentence to be translated into a translation model to obtain a translated target sentence corresponding to the source sentence;
an obtaining module, configured to acquire a first feature vector corresponding to the source sentence and a second feature vector corresponding to the target sentence based on the translation model;
a calculation module, configured to calculate an alignment degree between the source sentence and the target sentence based on the first feature vector and the second feature vector; and
a determining module, configured to determine whether to take the target sentence as the translation result based on the alignment degree.
CN202010256030.4A 2020-04-02 2020-04-02 Sentence translation method and device based on artificial intelligence Pending CN111553174A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010256030.4A CN111553174A (en) 2020-04-02 2020-04-02 Sentence translation method and device based on artificial intelligence


Publications (1)

Publication Number Publication Date
CN111553174A true CN111553174A (en) 2020-08-18

Family

ID=72007323

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010256030.4A Pending CN111553174A (en) 2020-04-02 2020-04-02 Sentence translation method and device based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN111553174A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004102942A (en) * 2002-09-12 2004-04-02 Advanced Telecommunication Research Institute International Method and program thereof which make computer system operate so as to calculate reliability of translation, method and program thereof which make computer system operate so as to select one translation among multiple translations, and storage medium strong semantic lookup table
CN102043774A (en) * 2011-01-13 2011-05-04 北京交通大学 Machine translation evaluation device and method
JP2014078132A (en) * 2012-10-10 2014-05-01 Toshiba Corp Machine translation device, method, and program
CN103810159A (en) * 2012-11-14 2014-05-21 阿里巴巴集团控股有限公司 Machine translation data processing method, system and terminal
CN109325242A (en) * 2018-09-19 2019-02-12 苏州大学 It is word-based to judge method, device and equipment that whether sentence be aligned to translation
CN109858042A (en) * 2018-11-20 2019-06-07 科大讯飞股份有限公司 A kind of determination method and device of translation quality
CN110162800A (en) * 2019-05-08 2019-08-23 北京百度网讯科技有限公司 The training method and device of translation model


Similar Documents

Publication Publication Date Title
CN107293296B (en) Voice recognition result correction method, device, equipment and storage medium
CN108170749B (en) Dialog method, device and computer readable medium based on artificial intelligence
CN111401084B (en) Method and device for machine translation and computer readable storage medium
KR102449614B1 (en) Apparatus and method for evaluating machine translation quality using distributed representation, machine translation apparatus, and apparatus for constructing distributed representation model
CN110175336B (en) Translation method and device and electronic equipment
JP2020528625A (en) Translation method, target information determination method and related equipment, storage medium
CN111931517B (en) Text translation method, device, electronic equipment and storage medium
CN113205817B (en) Speech semantic recognition method, system, device and medium
CN112069302A (en) Training method of conversation intention recognition model, conversation intention recognition method and device
CN113408272B (en) Training method, device, equipment and storage medium of abstract generation model
CN114676234A (en) Model training method and related equipment
CN112509555A (en) Dialect voice recognition method, dialect voice recognition device, dialect voice recognition medium and electronic equipment
JP2021033995A (en) Text processing apparatus, method, device, and computer-readable storage medium
CN113204611A (en) Method for establishing reading understanding model, reading understanding method and corresponding device
WO2023061106A1 (en) Method and apparatus for language translation, device, and medium
CN113553412A (en) Question and answer processing method and device, electronic equipment and storage medium
JP2023025126A (en) Training method and apparatus for deep learning model, text data processing method and apparatus, electronic device, storage medium, and computer program
JP2023002690A (en) Semantics recognition method, apparatus, electronic device, and storage medium
CN111126084A (en) Data processing method and device, electronic equipment and storage medium
CN113743101B (en) Text error correction method, apparatus, electronic device and computer storage medium
CN110704597A (en) Dialogue system reliability verification method, model generation method and device
CN115269768A (en) Element text processing method and device, electronic equipment and storage medium
CN110472241B (en) Method for generating redundancy-removed information sentence vector and related equipment
WO2023116572A1 (en) Word or sentence generation method and related device
CN115860003A (en) Semantic role analysis method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination