CN116681089A - Model training method, translation method and translation system - Google Patents
- Publication number
- CN116681089A (application number CN202310666692.2A)
- Authority
- CN
- China
- Prior art keywords
- text
- translation
- model
- fragments
- translated
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F40/00—Handling natural language data > G06F40/40—Processing or translation of natural language > G06F40/42—Data-driven translation
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06F—ELECTRIC DIGITAL DATA PROCESSING > G06F18/00—Pattern recognition > G06F18/20—Analysing > G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation > G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00—Computing arrangements based on biological models > G06N3/02—Neural networks > G06N3/04—Architecture, e.g. interconnection topology > G06N3/045—Combinations of networks > G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G—PHYSICS > G06—COMPUTING; CALCULATING OR COUNTING > G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00—Computing arrangements based on biological models > G06N3/02—Neural networks > G06N3/08—Learning methods
Abstract
The application provides a model training method, a translation method and a translation system. The model training method comprises the following steps: acquiring a sample text, wherein the sample text is a text in a first language; extracting a text fragment from the sample text, and acquiring a plurality of translation fragments corresponding to the text fragment, wherein the translation fragments are translation results of the text fragment and are texts in a second language; acquiring a label text corresponding to the sample text, wherein the label text is a translation result of the sample text and is a text in the second language; and training a translation model using the sample text, the text fragment, the plurality of translation fragments and the label text, with the sample text, the text fragment and the plurality of translation fragments serving as inputs of the translation model.
Description
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a model training method, a translation method and a translation system.
Background
In cross-language communication, translation errors greatly degrade communication quality. If a word corresponds to multiple translation results in a text, the word is easily translated incorrectly, and if the word carries key information in the text, cross-language communication quality suffers greatly. For example, the English word "airway" has Chinese translations whose senses are "respiratory tract", "air route" and "airway", and for the English sentence "The virus enters the airway cells and causes pathological changes" containing "airway", the related art may translate "airway" in its "airway" (air passage) sense rather than the correct "respiratory tract" sense; such a translation is not accurate.
Based on the above problems, a scheme for realizing accurate text translation is needed.
Disclosure of Invention
The application provides a model training method, a translation method and a translation system, which are used to train a translation model and can improve text translation accuracy.
A first aspect of an embodiment of the present application provides a model training method, including: acquiring a sample text, wherein the sample text is a text in a first language; extracting a text fragment from the sample text, and acquiring a plurality of translation fragments corresponding to the text fragment, wherein the translation fragments are translation results of the text fragment and are texts in a second language; acquiring a label text corresponding to the sample text, wherein the label text is a translation result of the sample text and is a text in the second language; and training the translation model using the sample text, the text fragment, the plurality of translation fragments and the label text, the sample text, the text fragment and the plurality of translation fragments being inputs of the translation model.
A second aspect of an embodiment of the present application provides a translation method, including: acquiring a text to be translated; extracting a target text fragment in a text to be translated, and acquiring a plurality of translation fragments corresponding to the target text fragment; inputting a text to be translated, a target text segment and a plurality of translation segments into a translation model to obtain a translation text corresponding to the text to be translated, wherein the text to be translated is a text in a first language, the translation segment and the translation text are texts in a second language, and the translation model is obtained according to the model training method of the first aspect.
A third aspect of an embodiment of the present application provides a translation method, including: acquiring a text to be translated, and sending the text to be translated to a server; and receiving the translation text of the text to be translated, which is sent by the server, wherein the translation text is obtained according to the translation method of the second aspect.
A fourth aspect of an embodiment of the present application provides a model training method, applied to a cloud server, including: acquiring a sample text, wherein the sample text is a text in a first language; extracting a text fragment from the sample text, and acquiring a plurality of translation fragments corresponding to the text fragment, wherein the translation fragments are translation results of the text fragment and are texts in a second language; acquiring a label text corresponding to the sample text, wherein the label text is a translation result of the sample text and is a text in the second language; and training the translation model using the sample text, the text fragment, the plurality of translation fragments and the label text, the sample text, the text fragment and the plurality of translation fragments being inputs of the translation model.
A fifth aspect of an embodiment of the present application provides a translation system, including:
the system comprises a cloud server and terminal equipment, wherein a translation model is deployed on the cloud server;
The terminal equipment is used for acquiring the text to be translated and sending the text to be translated to the cloud server;
the cloud server is used for acquiring the text to be translated; extracting a target text fragment in a text to be translated, and acquiring a plurality of translation fragments corresponding to the target text fragment; inputting a text to be translated, a target text fragment and a plurality of translation fragments into a translation model to obtain a translation text corresponding to the text to be translated, wherein the text to be translated is a text in a first language, the translation fragment and the translation text are texts in a second language, and the translation model is obtained according to the model training method of any one of the above.
The terminal equipment is used for receiving the translation text of the text to be translated, which is sent by the cloud server.
A sixth aspect of an embodiment of the present application provides an electronic device, including: a processor, a memory and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the model training method of the first and/or fourth aspect and/or the translation method of the second or third aspect.
A seventh aspect of the embodiments of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the model training method of the first and/or fourth aspects, and/or the translation method of the second or third aspects.
The embodiment of the application is applied to a translation scenario: a sample text is acquired, the sample text being a text in a first language; a text fragment is extracted from the sample text, and a plurality of translation fragments corresponding to the text fragment are acquired, the translation fragments being translation results of the text fragment and texts in a second language; a label text corresponding to the sample text is acquired, the label text being a translation result of the sample text and a text in the second language; and the translation model is trained using the sample text, the text fragment, the plurality of translation fragments and the label text, with the sample text, the text fragment and the plurality of translation fragments as inputs of the translation model, so that a translation model capable of accurately translating text can be obtained.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic diagram of a translation system provided in an exemplary embodiment of the present application;
FIG. 2 is a flow chart of steps of a model training method according to an exemplary embodiment of the present application;
FIG. 3 is a flowchart illustrating steps of another model training method according to an exemplary embodiment of the present application;
FIG. 4 is a schematic diagram of a translation model provided by an exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of an identification sub-model provided in an exemplary embodiment of the present application;
FIG. 6 is a block diagram of a model training apparatus according to an exemplary embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Automatic translation of bilingual terms has a positive driving effect on downstream tasks such as machine translation, cross-language retrieval and cross-language text analysis. With the rapid development of internationalization, information exchange between merchants and buyers in various countries has become more frequent and urgent, and since terms serve as key information in cross-language communication between buyers and sellers, the correctness of their translation strongly affects communication between users in various countries. Building a high-quality automatic term translation model requires automating the term disambiguation task. Bilingual terms exhibit a large number of one-to-many cases, and the meaning of the same term can vary widely across contexts. Taking the English term "airway" as an example, it can be translated into the senses "respiratory tract", "air route" and "airway"; for the English sentence "The virus enters the airway cells and causes pathological changes", the correct Chinese translation of the term "airway" is the "respiratory tract" sense, and translating it in the "airway" (air passage) sense would seriously harm the user experience. During e-commerce transactions, such incorrect translations may mislead users selling or purchasing products on the e-commerce platform. Although accurate term translation is important for machine translation in the e-commerce field, little work addresses term disambiguation. Current technology focuses on sentence-level machine translation with pre-specified bilingual terms and does not explore the one-to-many phenomenon of terms in depth. Existing term translation methods fall into two categories: the first improves the traditional beam search by introducing a forced decoding strategy; the second adjusts the model input through data enhancement.
Specifically, grid (lattice) beam search is the most typical forced-decoding-based term translation method. Compared with conventional beam search, it adds a dimension that marks the number of term words already generated, thereby expanding the beam search into a grid. Assuming the terms contain C words in total, grid beam search maintains C+1 groups storing candidate translations that satisfy different counts of generated term words, and finally selects the highest-scoring sentence among the C+1 groups of candidates as the translation result. Because grid beam search adds an extra dimension, its decoding complexity grows linearly with the number of term words. Data enhancement techniques, by contrast, achieve a certain degree of term intervention using only standard beam search, without modifying the model structure, and decode quickly. For example, term intervention can be performed by character replacement: terms in the sentences to be sampled are replaced with the corresponding translated terms by means of a prior term dictionary, and the resulting data are used to train the translation model. At the usage stage of such a translation model, the terms in the sentence to be translated must first be replaced with their designated translations before translating. The most important shortcoming of data enhancement methods is that the term translation success rate is limited and the translation accuracy is low.
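To make the character-replacement data-enhancement method just described concrete, the following is a minimal Python sketch; the term dictionary and example sentence are illustrative assumptions, not content of the patent:

```python
# Minimal sketch of the character/term-replacement data-enhancement method
# described above, assuming a one-to-one prior term dictionary.

def replace_terms(sentence: str, term_dict: dict) -> str:
    """Replace each source-language term with its designated translation."""
    # Replace longer terms first so multi-word terms are not partially
    # overwritten by shorter terms they contain.
    for term in sorted(term_dict, key=len, reverse=True):
        sentence = sentence.replace(term, term_dict[term])
    return sentence

term_dict = {"airway": "respiratory tract"}  # pre-specified bilingual term pair
src = "The virus enters the airway cells and causes pathological changes"
print(replace_terms(src, term_dict))
# The virus enters the respiratory tract cells and causes pathological changes
```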
The above two types of work consider sentence-level machine translation with pre-specified bilingual terms, but do not directly handle the case where a term has multiple meanings. To address this problem, the application provides a model training method, a translation method and a translation system, which include: acquiring a sample text, the sample text being a text in a first language; extracting a text fragment from the sample text and acquiring a plurality of translation fragments corresponding to the text fragment, the translation fragments being translation results of the text fragment and texts in a second language; acquiring a label text corresponding to the sample text, the label text being a translation result of the sample text and a text in the second language; and training the translation model using the sample text, the text fragment, the plurality of translation fragments and the label text, with the sample text, the text fragment and the plurality of translation fragments as inputs of the translation model. By taking into account the one-to-many relationship between a term (text fragment) and its translation fragments, the application trains a translation model that can translate text accurately, and can better improve the translation quality of core commodity information in e-commerce scenarios and of key information in cross-language communication between buyers and sellers, thereby improving indicators such as conversion rate and GMV (Gross Merchandise Volume, the transaction amount of a website) in e-commerce scenarios.
In the present embodiment, the device that executes the model training method is not limited. Optionally, the model training method may be implemented as a whole by means of a cloud computing system. For example, the model training method may be applied to a cloud server so as to run various neural network models with cloud resources; alternatively, unlike the cloud deployment, the model training method may also be applied to server-side equipment such as a conventional server or a server array.
Referring to fig. 1, which shows an application scenario of the present application, the present application provides a translation system including a cloud server 11 and a terminal device 12, where a translation model is deployed on the cloud server;
the terminal device 12 is configured to obtain a text to be translated, and send the text to be translated to the cloud server;
the cloud server 11 is used for acquiring the text to be translated; extracting a target text fragment in a text to be translated, and acquiring a plurality of translation fragments corresponding to the target text fragment; inputting a text to be translated, a target text fragment and a plurality of translation fragments into a translation model to obtain a translation text corresponding to the text to be translated, wherein the text to be translated is a text in a first language, the translation fragment and the translation text are texts in a second language, and the translation model is obtained according to the model training method;
The terminal device 12 is configured to receive the translated text of the text to be translated sent by the cloud server.
In addition, in application scenarios of the present application, in translation scenarios such as speech generated by e-commerce live streaming, voice conferences, interviews and impromptu speeches, the speech is recognized by a speech recognition technology to obtain the text to be translated, and the recognized text is translated by the translation model and then provided to the user. For example, if the speech is in English and needs to be translated into Chinese for the user, the English must be translated accurately so that the user obtains accurate information.
The foregoing is merely an exemplary application scenario of the present application, and the present application may also be applied to other translation scenarios, which are not limited herein.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
Fig. 2 is a flowchart of steps of a model training method according to an exemplary embodiment of the present application. The model training method as shown in fig. 2 specifically comprises the following steps:
s201, acquiring a sample text.
In an embodiment of the present application, the sample text is a text in a first language. The sample text may be a sentence or a passage. The first language is English, Chinese or another language. For example, "The virus enters the airway cells and causes pathological changes" is a sample text in English.
S202, extracting text fragments in the sample text, and acquiring a plurality of translation fragments corresponding to the text fragments.
The translation fragment is a translation result of the text fragment, and the translation fragment is a text of the second language.
In the embodiment of the application, the text segment is a preset word or phrase in the sample text, and the text segment is provided with a plurality of translation segments. For example, "airway" in the sample text "The virus enters the airway cells and causes pathological changes" is a text segment, where if the second language is chinese, the plurality of translated segments corresponding to "airway" are "respiratory tract", "air route", and "airway", respectively.
Further, the sample text may include a plurality of text fragments. For example, the sample text "The virus enters the airway cells and causes pathological changes" also includes the text fragment "pathological", whose corresponding plurality of translation fragments are "irrational", "pathological" and "pathological conditions", respectively.
S203, acquiring a label text corresponding to the sample text.
The label text is a translation result of the sample text, and the label text is a text in the second language. In an embodiment of the application, the label text is an accurate translation of the sample text. For example, the sample text "The virus enters the airway cells and causes pathological changes" corresponds to the label text "virus enters the respiratory tract cells and causes pathological changes" (the Chinese label text, rendered here in English).
S204, training a translation model by using the sample text, the text fragments, the plurality of translation fragments and the label text.
Wherein the sample text, the text segment, and the plurality of translation segments are inputs to a translation model.
In the embodiment of the application, a sample text, a text fragment and a plurality of translation fragments are input into a translation model, the translation model translates the sample text based on the text fragment and the plurality of translation fragments to obtain a predicted translation result, and the translation model is adjusted according to the predicted translation result and a first loss value of a tag text, so that the translation model is trained.
Further, in the embodiment of the present application, the label text includes a target translation fragment, and the target translation fragment is one of the plurality of translation fragments. When the translation model is trained, it learns, based on the sample text and the plurality of translation fragments, the accurate target translation fragment corresponding to the sample text, thereby achieving disambiguation, so the trained translation model can translate text accurately.
According to the embodiment of the application, the sample text is obtained, and the sample text is the text of the first language; extracting text fragments in the sample text, and acquiring a plurality of translation fragments corresponding to the text fragments, wherein the translation fragments are translation results of the text fragments, and the translation fragments are texts in a second language; acquiring a label text corresponding to the sample text, wherein the label text is a translation result of the sample text, and the label text is a text of a second language; the translation model is trained by adopting the sample text, the text fragment, the plurality of translation fragments and the tag text, and the sample text, the text fragment and the plurality of translation fragments are input into the translation model, so that the translation model can be trained to accurately translate the text.
Referring to fig. 3, a flowchart of steps of another model training method is provided in an exemplary embodiment of the present application. The model training method specifically comprises the following steps as shown in fig. 3:
s301, acquiring a sample text.
The specific implementation of this step refers to S201, which is not limited.
S302, acquiring a preset glossary.
Wherein the preset glossary comprises a plurality of terms and a plurality of translation fragments corresponding to the terms. The term is a word or phrase.
In the embodiment of the present application, preset glossaries containing different terms can be set for different application scenarios. For example, in an e-commerce scenario, terms may be product names, product materials, product styles and the like; the term "television", for instance, corresponds to a plurality of translation fragments such as "television set", "television program", "television system" and "television". In the financial field, the term may be a term of art of finance, such as the term "currency", whose corresponding translation fragments are "currency", "universal" and "popular". Likewise, in the medical field, the term may be a term of art of medicine, such as the term "airway", whose corresponding translation fragments are "respiratory tract", "air route" and "airway".
S303, extracting text fragments from the sample text based on a preset glossary.
Wherein the text segment is a term in a preset glossary. In the embodiment of the application, the text fragments can be extracted from the sample text by referring to the preset glossary.
For example, if the sample text is "The virus enters the airway cells and causes pathological changes", and "airway" in the sample text is a term in the preset glossary, it may be determined that "airway" is a text segment.
S304, determining a plurality of translation fragments corresponding to the text fragments in a preset glossary.
In the embodiment of the application, after determining the text segment, a plurality of translation segments corresponding to the term can be determined in a preset glossary to be the translation segments of the text segment.
Illustratively, referring to the preset glossary of Table 1, each term corresponds to a plurality of translation fragments. When a text fragment is determined to be the term "airway", the corresponding plurality of translation fragments are "respiratory tract", "air route" and "airway", respectively.
TABLE 1

| Term | Translation fragments |
| --- | --- |
| pathological | "irrational", "pathological", "pathological conditions" |
| airway | "respiratory tract", "air route", "airway" |
| anatomy | "anatomy", "human body", "profile" |
| internal medicine | "internal science", "physician", "oral medicine" |
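To make S303-S304 concrete, the following is a minimal sketch assuming the preset glossary is held as a Python dictionary mirroring Table 1; matching is simplified to substring search, whereas a real implementation would tokenize and normalize case:

```python
# Sketch of S303-S304: extract text fragments from the sample text via the
# preset glossary, then look up their candidate translation fragments.

PRESET_GLOSSARY = {
    "airway": ["respiratory tract", "air route", "airway"],
    "anatomy": ["anatomy", "human body", "profile"],
}

def extract_fragments(sample_text):
    """Return (text fragment, plurality of translation fragments) pairs."""
    return [(term, candidates)
            for term, candidates in PRESET_GLOSSARY.items()
            if term in sample_text]          # S303: fragment is a glossary term

sample = "The virus enters the airway cells and causes pathological changes"
print(extract_fragments(sample))
# [('airway', ['respiratory tract', 'air route', 'airway'])]
```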
S305, acquiring a label text corresponding to the sample text.
The specific implementation procedure of this step refers to S203, and will not be described here again.
S306, inputting the sample text, the text fragment and the plurality of translation fragments into a recognition sub-model, which identifies, among the plurality of translation fragments and based on the sample text, the predicted translation fragment corresponding to the text fragment.

Wherein the translation model comprises: a recognition sub-model and a translation sub-model. The predicted translation fragment is one of the plurality of translation fragments.
Referring to fig. 4, the recognition sub-model includes a text encoder and a context encoder. Inputting the sample text, the text fragment and the plurality of translation fragments into the recognition sub-model, which identifies the predicted translation fragment corresponding to the text fragment among the plurality of translation fragments based on the sample text, comprises the following steps: splicing the text fragment and the sample text to obtain a spliced text, and inputting the spliced text into the context encoder for encoding to obtain a first encoding vector; for each translation fragment of the plurality of translation fragments, inputting the translation fragment into the text encoder for encoding to obtain a second encoding vector corresponding to that translation fragment; for each second encoding vector of the plurality of second encoding vectors, determining the similarity between the second encoding vector and the first encoding vector; and determining the translation fragment corresponding to the second encoding vector with the maximum similarity as the predicted translation fragment.

Further, the text encoder encodes the plurality of translation fragments, and the context encoder encodes the spliced text; the similarity between the first encoding vector and each second encoding vector is then calculated in a common representation space, and the translation fragment with the largest similarity is selected as the predicted translation fragment.

In the embodiment of the application, each translation fragment is encoded into one second encoding vector. Further, the similarity between the first encoding vector and a second encoding vector is expressed by the cosine distance: the second encoding vector with the smallest cosine distance to the first encoding vector has the maximum similarity.

Further, referring to fig. 5, the text encoder is formed by stacking two Transformer (a neural network architecture) encoder layers and a second adaptation layer. For each translation fragment, a preset mark [CLS] is added before the fragment, and the hidden state of [CLS] is used as its representation; in fig. 5, [CLS] is added before each of "respiratory tract", "air route" and "airway". A serial number identifier is added to each character and/or word in the translation fragment, and the serial number identifier corresponding to the preset mark [CLS] is 0. The context encoder, like the text encoder, is also formed by stacking two Transformer encoder layers and a first adaptation layer. The spliced text is obtained by joining the text fragment and the sample text with a preset mark [SEP], and the text fragment inside the sample text is masked with [MASK] so that the context encoder can better encode the context. The context encoder also takes the hidden state of [CLS] as its output representation; in fig. 5, the text fragment "airway" is masked with [MASK].
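The selection step above can be sketched as follows, assuming the two encoders are available as callables that map a string to a fixed-size vector (the hidden state of [CLS]); all names here are illustrative assumptions, not the patent's implementation:

```python
import torch
import torch.nn.functional as F

# Sketch of the selection in S306: pick the predicted translation fragment
# by cosine similarity between the context encoding of the (masked) spliced
# text and the encodings of the candidate translation fragments.

def predict_fragment(context_encoder, text_encoder, spliced_text, candidates):
    h_ctx = context_encoder(spliced_text)                          # first encoding vector
    h_cand = torch.stack([text_encoder(c) for c in candidates])    # second encoding vectors
    # Cosine similarity in the common representation space; the fragment
    # with maximum similarity is the predicted translation fragment.
    sims = F.cosine_similarity(h_ctx.unsqueeze(0), h_cand, dim=-1)
    return candidates[int(sims.argmax())]

# The text fragment is masked in the sample text and spliced with [SEP]:
spliced = ("airway [SEP] The virus enters the [MASK] cells "
           "and causes pathological changes")
# predict_fragment(ctx_enc, txt_enc, spliced,
#                  ["respiratory tract", "air route", "airway"])
```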
S307, inputting the sample text, the text fragment and the predicted translation fragment into a translation sub-model, which translates the sample text based on the text fragment and the predicted translation fragment to obtain a predicted translation result corresponding to the sample text.

Referring to fig. 4, the translation sub-model may translate the sample text to obtain the predicted translation result. The translation sub-model may be an LCNMT (dictionary-constrained neural machine translation) model.
S308, determining a first loss value of the predicted translation result relative to the label text, and adjusting model parameters of the recognition sub-model and the translation sub-model according to the first loss value.
In the embodiment of the application, a first loss value of the predicted translation result relative to the label text can be determined using a cross-entropy loss function; when the first loss value is greater than a first threshold, the model parameters of the recognition sub-model and the translation sub-model are adjusted using the first loss value so that both sub-models converge. When the first loss value is less than or equal to the first threshold, the trained translation model is obtained.
S309, extracting a target translation fragment corresponding to the text fragment from the label text.

Wherein the target translation fragment is one of the plurality of translation fragments. In the embodiment of the application, the target translation fragment corresponding to the text fragment can be extracted from the label text based on the preset glossary and the text fragment.

Illustratively, the sample text is "The virus enters the airway cells and causes pathological changes", the text fragment is "airway", the label text is "virus enters respiratory tract cells and causes pathological changes", and the target translation fragment is "respiratory tract".
S310, determining a second loss value of the predicted translated segment relative to the target translated segment, and adjusting model parameters of the recognition sub-model according to the second loss value.
In the embodiment of the application, a second loss value of the predicted translation fragment relative to the target translation fragment can be determined using a cross-entropy loss function; when the second loss value is greater than a second threshold, the model parameters of the recognition sub-model are adjusted using the second loss value so that the recognition sub-model converges. When the second loss value is less than or equal to the second threshold, the trained recognition sub-model is obtained.
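A compact sketch of the two loss computations in S308 and S310, assuming the sub-models expose logits as tensors (names and shapes are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

# Sketch of S308 and S310. `translation_logits` is assumed to have shape
# (sequence_length, vocab_size) and `fragment_logits` shape (num_candidates,).

def training_losses(translation_logits, label_token_ids,
                    fragment_logits, target_fragment_index):
    # First loss value: predicted translation result vs. label text (S308)
    first_loss = F.cross_entropy(translation_logits, label_token_ids)
    # Second loss value: predicted vs. target translation fragment (S310)
    second_loss = F.cross_entropy(fragment_logits.unsqueeze(0),
                                  torch.tensor([target_fragment_index]))
    return first_loss, second_loss

# The first loss updates both sub-models until it falls below the first
# threshold; the second loss updates only the recognition sub-model.
```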
Further, the model training method can be applied to a cloud server, and specific implementation processes refer to the above content and are not described herein.
Referring to fig. 4, which illustrates a schematic diagram of the translation model of the present application, the training process of the translation model is specifically as follows. Training data (sample text, text fragment, plurality of translation fragments, and label text) is first constructed from the parallel corpus <sample text, label text> and the preset glossary, and one-to-one term constraints of the form <text fragment, target translation fragment> are extracted from the parallel corpus using the preset glossary. The translation sub-model is trained with the constructed training data until it converges. Then the plurality of translation fragments corresponding to each text fragment is determined using the preset glossary, and the recognition sub-model is trained on (sample text, text fragment, plurality of translation fragments, target translation fragment) tuples until it converges, so that the recognition sub-model achieves the disambiguation effect.
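The data-construction step just described can be sketched as follows; the tuple layout and helper names are assumptions for illustration, not a format prescribed by the patent:

```python
# Sketch of the training-data construction: from a parallel corpus of
# <sample text, label text> pairs and the preset glossary, build
# (sample text, text fragment, translation fragments, target fragment) tuples.

def build_training_data(parallel_corpus, preset_glossary):
    data = []
    for sample_text, label_text in parallel_corpus:
        for term, candidates in preset_glossary.items():
            if term in sample_text:
                # One-to-one term constraint: the target fragment is the
                # candidate that actually appears in the label text.
                targets = [c for c in candidates if c in label_text]
                if targets:
                    data.append((sample_text, term, candidates, targets[0]))
    return data
```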
In addition, the application also provides a translation method which comprises the following steps: acquiring a text to be translated; extracting a target text fragment in a text to be translated, and acquiring a plurality of translation fragments corresponding to the target text fragment; inputting a text to be translated, a target text fragment and a plurality of translation fragments into a translation model to obtain a translation text corresponding to the text to be translated, wherein the text to be translated is a text in a first language, the translation fragment and the translation text are texts in a second language, and the translation model is obtained according to the model training method of any one of the above.
Further, extracting a target text segment in the text to be translated, and obtaining a plurality of translation segments corresponding to the target text segment includes: acquiring a preset glossary, wherein the preset glossary comprises a plurality of terms and a plurality of translation fragments corresponding to the terms; extracting a target text fragment from a text to be translated based on a preset glossary, wherein the target text fragment is a term in the preset glossary; and determining a plurality of translation fragments corresponding to the target text fragment in a preset glossary.
In addition, the application also provides a translation method, which is applied to the terminal equipment and comprises the following steps: acquiring a text to be translated, and sending the text to be translated to a server; and receiving the translation text of the text to be translated, which is sent by the server, wherein the translation text is obtained according to the translation method.
In summary, during use of the trained translation model, given a text to be translated and a preset glossary, the target text fragment (term) contained in the text to be translated is first found by retrieval, and the plurality of translation fragments of the term is looked up by means of the preset glossary; the text to be translated, the target text fragment and the plurality of translation fragments are then input into the recognition sub-model for disambiguation, yielding an actual translation fragment, which is one of the plurality of translation fragments. Finally, the actual translation fragment, the text to be translated and the target text fragment are input into the translation sub-model for translation, obtaining the translated text of the text to be translated.
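As a compact illustration of this inference pipeline, the following sketch wires the pieces together; the sub-model call signatures are assumed, since the patent does not fix these interfaces:

```python
# End-to-end inference sketch following the paragraph above. The two
# sub-models are assumed to be callables with illustrative signatures.

def translate(text_to_translate, preset_glossary,
              recognition_model, translation_model):
    for term, candidates in preset_glossary.items():
        if term in text_to_translate:          # retrieve the target text fragment
            actual = recognition_model(text_to_translate, term, candidates)
            return translation_model(text_to_translate, term, actual)
    # No glossary term found: translate without term constraints.
    return translation_model(text_to_translate, None, None)
```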
The specific implementation process of the embodiment of the present application refers to the above embodiment, and is not described herein.
In addition to providing a model training method, in an embodiment of the present application, a model training apparatus is provided, as shown in fig. 6, where the model training apparatus 60 includes:
a first obtaining module 61, configured to obtain a sample text, where the sample text is a text in a first language;
the extracting module 62 is configured to extract a text segment in the sample text, and obtain a plurality of translation segments corresponding to the text segment, where the translation segments are translation results of the text segment, and the translation segments are text in the second language;
A second obtaining module 63, configured to obtain a label text corresponding to the sample text, where the label text is a translation result of the sample text and a text in the second language;

a training module 64, configured to train a translation model using the sample text, the text fragment, the plurality of translation fragments and the label text, with the sample text, the text fragment and the plurality of translation fragments being inputs of the translation model.

In an alternative embodiment, the translation model includes a recognition sub-model and a translation sub-model, and the training module 64 is specifically configured to: input the sample text, the text fragment and the plurality of translation fragments into the recognition sub-model, which identifies, among the plurality of translation fragments based on the sample text, a predicted translation fragment corresponding to the text fragment, the predicted translation fragment being one of the plurality of translation fragments; input the sample text, the text fragment and the predicted translation fragment into the translation sub-model, which translates the sample text based on the text fragment and the predicted translation fragment to obtain a predicted translation result corresponding to the sample text; and determine a first loss value of the predicted translation result relative to the label text, and adjust model parameters of the recognition sub-model and the translation sub-model according to the first loss value.

In an alternative embodiment, the training module 64 is further specifically configured to: extract a target translation fragment corresponding to the text fragment from the label text, the target translation fragment being one of the plurality of translation fragments; and determine a second loss value of the predicted translation fragment relative to the target translation fragment, and adjust model parameters of the recognition sub-model according to the second loss value.

In an alternative embodiment, the recognition sub-model includes a text encoder and a context encoder, and the training module 64, when inputting the sample text, the text fragment and the plurality of translation fragments into the recognition sub-model to identify the predicted translation fragment corresponding to the text fragment among the plurality of translation fragments based on the sample text, is specifically configured to: splice the text fragment and the sample text to obtain a spliced text, and input the spliced text into the context encoder for encoding to obtain a first encoding vector; for each translation fragment of the plurality of translation fragments, input the translation fragment into the text encoder for encoding to obtain a corresponding second encoding vector; for each second encoding vector of the plurality of second encoding vectors, determine the similarity between the second encoding vector and the first encoding vector; and determine the translation fragment corresponding to the second encoding vector with the maximum similarity as the predicted translation fragment.
In an alternative embodiment, the extracting module 62 is specifically configured to obtain a preset glossary, where the preset glossary includes a plurality of terms and a plurality of translation segments corresponding to the terms; extracting text fragments from the sample text based on a preset glossary, wherein the text fragments are terms in the preset glossary; and determining a plurality of translation fragments corresponding to the text fragments in a preset glossary.
The specific implementation process of the model training device provided in the embodiment of the present application refers to the above method embodiment, and is not described herein again.
In an embodiment of the present application, there is provided a translation apparatus (not shown) including:
the acquisition module is used for acquiring the text to be translated;
the extraction module is used for extracting target text fragments in the text to be translated and acquiring a plurality of translation fragments corresponding to the target text fragments;
the translation module is used for inputting the text to be translated, the target text fragment and the plurality of translation fragments into a translation model to obtain a translation text corresponding to the text to be translated, wherein the text to be translated is a text in a first language, the translation fragment and the translation text are texts in a second language, and the translation model is obtained according to the model training method.
In an alternative embodiment, the extraction module is specifically configured to: acquiring a preset glossary, wherein the preset glossary comprises a plurality of terms and a plurality of translation fragments corresponding to the terms; extracting a target text fragment from a text to be translated based on a preset glossary, wherein the target text fragment is a term in the preset glossary; and determining a plurality of translation fragments corresponding to the target text fragment in a preset glossary.
The specific implementation process of the translation device provided in the embodiment of the present application refers to the above method embodiment, and is not described herein again.
In addition, in some of the flows described in the above embodiments and the drawings, a plurality of operations appearing in a particular order are included, but it should be clearly understood that the operations may be performed out of order or performed in parallel in the order in which they appear herein, merely for distinguishing between the various operations, and the sequence number itself does not represent any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first" and "second" herein are used to distinguish different messages, devices, modules, etc., and do not represent a sequence, and are not limited to the "first" and the "second" being different types.
Fig. 7 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application. The electronic device is used for running the model training method and the translation method. As shown in fig. 7, the electronic device includes: a memory 74 and a processor 75.
Memory 74 is used to store computer programs and may be configured to store various other data to support operations on the electronic device. The memory 74 may be an object store (Object Storage Service, OSS).
The memory 74 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
A processor 75, coupled to the memory 74, for executing the computer program in the memory 74 to: acquire a sample text, the sample text being a text in a first language; extract a text fragment from the sample text, and acquire a plurality of translation fragments corresponding to the text fragment, the translation fragments being translation results of the text fragment and texts in a second language; acquire a label text corresponding to the sample text, the label text being a translation result of the sample text and a text in the second language; and train the translation model using the sample text, the text fragment, the plurality of translation fragments and the label text, with the sample text, the text fragment and the plurality of translation fragments being inputs of the translation model.

Further optionally, the translation model includes a recognition sub-model and a translation sub-model, and the processor 75, when training the translation model using the sample text, the text fragment, the plurality of translation fragments and the label text, is specifically configured to: input the sample text, the text fragment and the plurality of translation fragments into the recognition sub-model, which identifies, among the plurality of translation fragments based on the sample text, a predicted translation fragment corresponding to the text fragment, the predicted translation fragment being one of the plurality of translation fragments; input the sample text, the text fragment and the predicted translation fragment into the translation sub-model, which translates the sample text based on the text fragment and the predicted translation fragment to obtain a predicted translation result corresponding to the sample text; and determine a first loss value of the predicted translation result relative to the label text, and adjust model parameters of the recognition sub-model and the translation sub-model according to the first loss value.

Further optionally, the processor 75 is further specifically configured to: extract a target translation fragment corresponding to the text fragment from the label text, the target translation fragment being one of the plurality of translation fragments; and determine a second loss value of the predicted translation fragment relative to the target translation fragment, and adjust model parameters of the recognition sub-model according to the second loss value.

Further optionally, the recognition sub-model includes a text encoder and a context encoder, and the processor 75, when inputting the sample text, the text fragment and the plurality of translation fragments into the recognition sub-model to identify the predicted translation fragment corresponding to the text fragment among the plurality of translation fragments based on the sample text, is specifically configured to: splice the text fragment and the sample text to obtain a spliced text, and input the spliced text into the context encoder for encoding to obtain a first encoding vector; for each translation fragment of the plurality of translation fragments, input the translation fragment into the text encoder for encoding to obtain a corresponding second encoding vector; for each second encoding vector of the plurality of second encoding vectors, determine the similarity between the second encoding vector and the first encoding vector; and determine the translation fragment corresponding to the second encoding vector with the maximum similarity as the predicted translation fragment.
Further optionally, the processor 75 is specifically configured to obtain a preset glossary, where the preset glossary includes a plurality of terms and a plurality of translation segments corresponding to the terms, when extracting text segments in the sample text and obtaining a plurality of translation segments corresponding to the text segments; extracting text fragments from the sample text based on a preset glossary, wherein the text fragments are terms in the preset glossary; and determining a plurality of translation fragments corresponding to the text fragments in a preset glossary.
In an alternative embodiment, processor 75 is coupled to memory 74 for executing a computer program in memory 74 for further: acquiring a text to be translated; extracting a target text fragment in a text to be translated, and acquiring a plurality of translation fragments corresponding to the target text fragment; inputting a text to be translated, a target text fragment and a plurality of translation fragments into a translation model to obtain a translation text corresponding to the text to be translated, wherein the text to be translated is a text in a first language, the translation fragment and the translation text are texts in a second language, and the translation model is obtained according to the model training method of any one of the above.
Further optionally, the processor 75 is specifically configured to obtain a preset glossary, where the preset glossary includes a plurality of terms and a plurality of translation segments corresponding to the terms, when extracting a target text segment in the text to be translated and obtaining a plurality of translation segments corresponding to the target text segment; extracting a target text fragment from a text to be translated based on a preset glossary, wherein the target text fragment is a term in the preset glossary; and determining a plurality of translation fragments corresponding to the target text fragment in a preset glossary.
In an alternative embodiment, processor 75 is coupled to memory 74 for executing a computer program in memory 74 for further: acquiring a text to be translated, and sending the text to be translated to a server; and receiving the translation text of the text to be translated, which is sent by the server, wherein the translation text is obtained according to the translation method.
Further, as shown in fig. 7, the electronic device further includes: firewall 71, load balancer 72, communication component 76, power component 78, and other components. Only some of the components are schematically shown in fig. 7, which does not mean that the electronic device only comprises the components shown in fig. 7.
Accordingly, embodiments of the present application also provide a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the steps in the methods shown above.
Accordingly, embodiments of the present application also provide a computer program product comprising a computer program/instructions which, when executed by a processor, cause the processor to carry out the steps of the method shown above.
The communication component of fig. 7 is configured to facilitate wired or wireless communication between the device where the communication component is located and other devices. The device where the communication component is located can access a wireless network based on a communication standard, such as WiFi, a 2G, 3G, 4G/LTE or 5G mobile communication network, or a combination thereof. In one exemplary embodiment, the communication component receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component further includes a near field communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The power supply assembly shown in fig. 7 provides power to various components of the device in which the power supply assembly is located. The power components may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the devices in which the power components are located.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs and/or GPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed, or elements inherent to such process, method, article or apparatus. Without further limitation, an element preceded by the phrase "comprising a(n) ..." does not exclude the presence of other like elements in the process, method, article or apparatus that comprises the element.
The foregoing descriptions are merely embodiments of the present application and are not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.
Claims (11)
1. A method of model training, comprising:
acquiring a sample text, wherein the sample text is a text in a first language;
extracting a text fragment from the sample text, and acquiring a plurality of translation fragments corresponding to the text fragment, wherein the translation fragments are translation results of the text fragment, and the translation fragments are texts in a second language;
acquiring a label text corresponding to the sample text, wherein the label text is a translation result of the sample text, and the label text is a text in the second language;
training a translation model using the sample text, the text fragment, the plurality of translation fragments and the label text, wherein the sample text, the text fragment and the plurality of translation fragments are inputs of the translation model.
2. The model training method of claim 1, wherein the translation model comprises: a recognition sub-model and a translation sub-model, and the training a translation model using the sample text, the text fragment, the plurality of translation fragments and the label text comprises:
inputting the sample text, the text fragment and the plurality of translation fragments into the recognition sub-model, wherein the recognition sub-model recognizes, based on the sample text, a predicted translation fragment corresponding to the text fragment among the plurality of translation fragments, and the predicted translation fragment is one of the plurality of translation fragments;
inputting the sample text, the text fragment and the predicted translation fragment into the translation sub-model, wherein the translation sub-model translates the sample text based on the text fragment and the predicted translation fragment to obtain a predicted translation result corresponding to the sample text;
and determining a first loss value of the predicted translation result relative to the label text, and adjusting model parameters of the recognition sub-model and the translation sub-model according to the first loss value.
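One way to read claim 2 as code is the following PyTorch sketch, in which recognition_model, translation_model and its tokenize method are hypothetical stand-ins. Note that the hard argmax selection shown here passes no gradient to the recognition sub-model; in practice that sub-model would be trained through a soft selection or through the second loss value introduced in claim 3 below.

import torch.nn.functional as F

def train_step(recognition_model, translation_model, optimizer,
               sample_text, text_fragment, candidate_fragments, label_text):
    # Recognition sub-model: score every candidate translation fragment
    # against the sample text and pick the predicted translation fragment.
    scores = recognition_model(sample_text, text_fragment, candidate_fragments)
    predicted_fragment = candidate_fragments[scores.argmax().item()]

    # Translation sub-model: translate the sample text conditioned on the
    # text fragment and the predicted translation fragment, producing
    # per-token logits over the target vocabulary.
    logits = translation_model(sample_text, text_fragment, predicted_fragment)

    # First loss value: predicted translation result vs. the label text.
    target_ids = translation_model.tokenize(label_text)  # hypothetical helper
    first_loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                                 target_ids.view(-1))

    # Adjust model parameters of both sub-models according to the loss.
    optimizer.zero_grad()
    first_loss.backward()
    optimizer.step()
    return first_loss.item()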
3. The model training method of claim 2, further comprising:
extracting a target translation fragment corresponding to the text fragment from the label text, wherein the target translation fragment is one of the plurality of translation fragments;
and determining a second loss value of the predicted translation fragment relative to the target translation fragment, and adjusting model parameters of the recognition sub-model according to the second loss value.
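A matching sketch for claim 3, under the assumption that the target translation fragment can be extracted from the label text by a verbatim containment test and that exactly one candidate occurs there; the recognition scores are taken to be raw (unnormalised) logits, one per candidate.

import torch
import torch.nn.functional as F

def second_loss(recognition_scores: torch.Tensor,
                candidate_fragments: list, label_text: str) -> torch.Tensor:
    """Cross-entropy of the recognition scores against the target
    translation fragment, i.e. the candidate found in the label text."""
    target_index = next(
        i for i, fragment in enumerate(candidate_fragments)
        if fragment in label_text
    )
    return F.cross_entropy(recognition_scores.unsqueeze(0),
                           torch.tensor([target_index]))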
4. The model training method of claim 2 or 3, wherein the recognition sub-model comprises: a text encoder and a context encoder, and the inputting the sample text, the text fragment and the plurality of translation fragments into the recognition sub-model, the recognition sub-model recognizing, based on the sample text, the predicted translation fragment corresponding to the text fragment among the plurality of translation fragments, comprises:
concatenating the text fragment and the sample text to obtain a concatenated text, and inputting the concatenated text into the context encoder for encoding to obtain a first encoding vector;
for each translation fragment of the plurality of translation fragments, inputting the translation fragment into the text encoder for encoding to obtain a second encoding vector corresponding to the translation fragment;
for each second encoding vector of the plurality of second encoding vectors, determining a similarity between the second encoding vector and the first encoding vector;
and determining the translation fragment corresponding to the second encoding vector with the maximum similarity as the predicted translation fragment.
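Claim 4 reduces to an encode-and-compare routine. The sketch below assumes two hypothetical sentence encoders that each map a string to a fixed-size vector; the " [SEP] " separator and cosine similarity are illustrative choices, since the claim fixes neither the concatenation format nor the similarity measure.

import torch
import torch.nn.functional as F

def predict_translation_fragment(context_encoder, text_encoder,
                                 sample_text, text_fragment,
                                 candidate_fragments):
    # Concatenate the text fragment with the sample text and encode the
    # result into the first encoding vector.
    first_vector = context_encoder(text_fragment + " [SEP] " + sample_text)

    # Encode each candidate translation fragment into a second encoding
    # vector and measure its similarity to the first encoding vector.
    similarities = torch.stack([
        F.cosine_similarity(first_vector, text_encoder(fragment), dim=-1)
        for fragment in candidate_fragments
    ])

    # The candidate with the maximum similarity is the predicted
    # translation fragment.
    return candidate_fragments[similarities.argmax().item()]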
5. The model training method of any one of claims 1 to 3, wherein the extracting a text fragment from the sample text and acquiring a plurality of translation fragments corresponding to the text fragment comprises:
acquiring a preset glossary, wherein the preset glossary comprises a plurality of terms and a plurality of translation fragments corresponding to each of the terms;
extracting the text fragment from the sample text based on the preset glossary, wherein the text fragment is a term in the preset glossary;
and determining, in the preset glossary, the plurality of translation fragments corresponding to the text fragment.
6. A method of translation, comprising:
acquiring a text to be translated;
extracting a target text fragment from the text to be translated, and acquiring a plurality of translation fragments corresponding to the target text fragment;
inputting the text to be translated, the target text fragment and the plurality of translation fragments into a translation model to obtain a translation text corresponding to the text to be translated, wherein the text to be translated is a text in a first language, the translation fragments and the translation text are texts in a second language, and the translation model is obtained according to the model training method of any one of claims 1 to 5.
7. The translation method of claim 6, wherein the extracting a target text fragment from the text to be translated and acquiring a plurality of translation fragments corresponding to the target text fragment comprises:
acquiring a preset glossary, wherein the preset glossary comprises a plurality of terms and a plurality of translation fragments corresponding to each of the terms;
extracting the target text fragment from the text to be translated based on the preset glossary, wherein the target text fragment is a term in the preset glossary;
and determining, in the preset glossary, the plurality of translation fragments corresponding to the target text fragment.
8. A translation method, applied to a terminal device, comprising:
acquiring a text to be translated, and sending the text to be translated to a server;
and receiving a translation text of the text to be translated sent by the server, wherein the translation text is obtained according to the translation method of claim 6 or 7.
9. A model training method, applied to a cloud server, the method comprising:
acquiring a sample text, wherein the sample text is a text in a first language;
extracting a text fragment from the sample text, and acquiring a plurality of translation fragments corresponding to the text fragment, wherein the translation fragments are translation results of the text fragment, and the translation fragments are texts in a second language;
acquiring a label text corresponding to the sample text, wherein the label text is a translation result of the sample text, and the label text is a text in the second language;
training a translation model using the sample text, the text fragment, the plurality of translation fragments and the label text, wherein the sample text, the text fragment and the plurality of translation fragments are inputs of the translation model.
10. A translation system, comprising:
a cloud server and a terminal device, wherein a translation model is deployed on the cloud server;
the terminal device is configured to acquire a text to be translated and send the text to be translated to the cloud server;
the cloud server is configured to: acquire the text to be translated; extract a target text fragment from the text to be translated, and acquire a plurality of translation fragments corresponding to the target text fragment; and input the text to be translated, the target text fragment and the plurality of translation fragments into the translation model to obtain a translation text corresponding to the text to be translated, wherein the text to be translated is a text in a first language, the translation fragments and the translation text are texts in a second language, and the translation model is obtained according to the model training method of any one of claims 1 to 5;
and the terminal device is configured to receive the translation text of the text to be translated sent by the cloud server.
11. An electronic device, comprising: a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the model training method of any one of claims 1 to 5 or claim 9, and/or the translation method of any one of claims 6 to 8.
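To illustrate the division of labour in the translation system of claim 10, here is a minimal cloud-server sketch using Flask (an assumption; this application names no web framework). translation_model is a hypothetical stand-in for the deployed translation model, and extract_target_fragments / PRESET_GLOSSARY are the hypothetical helpers from the earlier glossary sketch.

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/translate", methods=["POST"])
def translate():
    # The terminal device posts the text to be translated as JSON.
    text = request.get_json()["text"]
    # Extract target text fragments and their candidate translation
    # fragments from the preset glossary, then run the translation model.
    fragments = extract_target_fragments(text, PRESET_GLOSSARY)
    translation = translation_model(text, fragments)
    # The translation text is returned to the terminal device.
    return jsonify({"translation": translation})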