CN109960814B - Model parameter searching method and device - Google Patents


Info

Publication number
CN109960814B
CN109960814B (Application CN201910227374.XA)
Authority
CN
China
Prior art keywords
translation
text
search
sentence
translated
Prior art date
Legal status
Active
Application number
CN201910227374.XA
Other languages
Chinese (zh)
Other versions
CN109960814A (en)
Inventor
李长亮
李小龙
唐剑波
王勇博
Current Assignee
Chengdu Kingsoft Interactive Entertainment Technology Co ltd
Beijing Kingsoft Digital Entertainment Co Ltd
Original Assignee
Chengdu Kingsoft Interactive Entertainment Technology Co ltd
Beijing Kingsoft Digital Entertainment Co Ltd
Priority date
Filing date
Publication date
Application filed by Chengdu Kingsoft Interactive Entertainment Technology Co ltd, Beijing Kingsoft Digital Entertainment Co Ltd filed Critical Chengdu Kingsoft Interactive Entertainment Technology Co ltd
Priority to CN201910227374.XA priority Critical patent/CN109960814B/en
Publication of CN109960814A publication Critical patent/CN109960814A/en
Application granted granted Critical
Publication of CN109960814B publication Critical patent/CN109960814B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a model parameter searching method and device. The model parameter searching method comprises the following steps: obtaining the respective translations output by at least two translation models after translating the corpus in a corpus library, together with the translation probability of each translation sentence in the translations; searching a parameter space for weight parameter sets corresponding to the at least two translation models based on the translations and the translation probability of each translation sentence; and taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models, respectively. In the model parameter searching method provided by the application, the corpus in the corpus library and the translation probabilities produced by the translation models are combined when searching the parameter space for the models' weight parameters, which improves parameter search efficiency; a translation model applying the searched target weight parameters achieves higher translation accuracy and obtains a more accurate translation result.

Description

Model parameter searching method and device
Technical Field
The application relates to the technical field of machine translation, in particular to a model parameter searching method. The application also relates to a model parameter searching device, a computing device and a computer readable storage medium.
Background
Natural language processing studies theories and methods for effective communication between humans and computers using natural language. With its rapid development, machine translation, a traditional branch of computational linguistics, has also attracted wide attention. Machine translation, also called automatic translation, is the process of using a computer to convert one natural language (the source language) into another natural language (the target language). It is one of the ultimate goals of artificial intelligence, has important practical value, and plays an increasingly important role in promoting political, economic and cultural exchange.
At present, an important implementation of machine translation is to establish machine translation models: the content to be translated is input into a plurality of pre-built and pre-trained machine translation models, which translate it according to different algorithms; the translation result of each model is evaluated by some means, and the best-evaluated result is taken as the translation of the content to be translated. However, when determining the translation among the results of the machine translation models, the loss of each model is insufficiently considered, and the losses of the translation results obtained by models with different algorithms on different content cannot be fully reflected, so the accuracy of the final translation is low.
Disclosure of Invention
In view of the above, the embodiment of the application provides a model parameter searching method to solve the technical defects existing in the prior art. The embodiment of the application also provides a model parameter searching device, a computing device and a computer readable storage medium.
The application provides a model parameter searching method, which comprises the following steps:
obtaining respective translations which are output after at least two translation models translate the corpus in the corpus, and the translation probability of each translation sentence in the translations;
searching weight parameter sets corresponding to the at least two translation models in a parameter space based on the translation and the translation probability of each translation sentence in the translation;
and respectively taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models.
Optionally, the searching the weight parameter set corresponding to the at least two translation models in the parameter space based on the translation and the translation probability of each translation sentence in the translation includes:
constructing a search tree based on the translation and the translation probability of each translation sentence in the translation; the weight parameter sets in the parameter space are in one-to-one correspondence with the search nodes in the search tree;
And searching weight parameter sets corresponding to the at least two translation models in the parameter space according to the search tree.
Optionally, in the executing step of constructing a search tree based on the translation and the translation probability of each translation sentence in the translation, the following operations are executed for the search node in the search tree corresponding to the weight parameter set in the parameter space:
taking the weight parameters in the weight parameter set in the parameter space corresponding to the search node as the weight parameters of the at least two translation models, and calculating the heuristic cost of the search node in combination with the translation probability of each translation sentence in the translation;
the heuristic cost of the search node is obtained by summing the model heuristic costs of the at least two translation models, wherein the model heuristic cost of each translation model is the sum, over the translation sentences in its translation, of the product of the translation model's weight parameter and the translation probability of each translation sentence.
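This heuristic-cost computation can be sketched minimally as follows (the function and variable names are illustrative, not from the patent; the patent does not specify an implementation):

```python
def model_heuristic_cost(weight, sentence_probs):
    # Model heuristic cost: the model's weight parameter multiplied by the
    # translation probability of each sentence, summed over all sentences.
    return sum(weight * p for p in sentence_probs)

def node_heuristic_cost(weights, probs_per_model):
    # Heuristic cost of a search node: sum of the model heuristic costs of
    # all translation models, using the node's weight parameter set.
    return sum(model_heuristic_cost(w, probs)
               for w, probs in zip(weights, probs_per_model))

# Example: two models, two translation sentences each (made-up numbers).
weights = [0.6, 0.4]                     # one weight per translation model
probs = [[0.9, 0.8], [0.7, 0.5]]         # per-sentence probabilities per model
cost = node_heuristic_cost(weights, probs)
# 0.6 * (0.9 + 0.8) + 0.4 * (0.7 + 0.5) = 1.02 + 0.48 = 1.5
```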
Optionally, the lower layer search node of any one search node in the search tree is determined by adopting the following manner:
determining, by a Gaussian algorithm, a set of adjacent search nodes that are adjacent to and connected with the search node in the search tree;
and selecting, according to the calculated heuristic cost of each adjacent search node in the search node set, at least one adjacent search node with the highest heuristic cost as a lower-layer search node of the search node.
Optionally, after the step of determining, by a Gaussian algorithm, the set of adjacent search nodes that are adjacent to and connected with the search node in the search tree, and before the step of selecting, according to the calculated heuristic cost of each adjacent search node in the search node set, at least one adjacent search node with the highest heuristic cost as a lower-layer search node of the search node, the method includes:
for each search node in the search node set, taking the weight parameters in the weight parameter set in the parameter space corresponding to the search node as the weight parameters of the at least two translation models;
based on the weight parameters of the at least two translation models, fusing, by reranking, the text translations output by the at least two translation models to obtain a reference text translation of the text to be translated;
comparing the reference text translation with the real translation of the corpus, and determining the translation accuracy and/or the translation loss of the reference text translation relative to the real translation;
judging whether the translation accuracy and/or the translation loss is greater than the translation accuracy and/or the translation loss corresponding to the upper-layer search node of the search node;
and if not, removing the search node from the search node set to which it belongs.
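The accuracy-based pruning above might look like the following sketch (the scoring function is a hypothetical stand-in for comparing each node's fused reference translation against the real translation; node labels are invented):

```python
def prune_search_nodes(candidate_nodes, parent_score, score_fn):
    # Keep only candidate search nodes whose translation accuracy exceeds
    # that of the upper-layer (parent) search node; the rest are removed
    # from the set before lower-layer nodes are selected.
    return [node for node in candidate_nodes if score_fn(node) > parent_score]

# Hypothetical accuracy scores of candidate weight-parameter-set nodes.
scores = {"n1": 0.62, "n2": 0.55, "n3": 0.71}
kept = prune_search_nodes(["n1", "n2", "n3"], parent_score=0.60,
                          score_fn=scores.get)
# only nodes scoring above the parent's 0.60 survive: ["n1", "n3"]
```

Shrinking the candidate set this way reduces the work done when the highest-cost nodes are later selected, which is the efficiency gain the passage describes.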
Optionally, the step of searching the weight parameter sets corresponding to the at least two translation models in the parameter space based on the translation and the translation probability of each translation sentence in the translation is implemented based on a beam search algorithm.
Optionally, after the step of taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models, the method includes:
respectively inputting the text to be translated into the at least two translation models for text translation, and respectively outputting text translations aiming at the text to be translated;
and fusing the text translations output by the at least two translation models by reranking to obtain the optimal text translation of the text to be translated.
Optionally, the text translation output by each of the at least two translation models consists of at least one text translation sentence, and the text translation sentences correspond respectively to the sentences to be translated in the text to be translated.
Optionally, the fusing the text translations output by the at least two translation models by reranking to obtain an optimal text translation of the text to be translated includes:
for each sentence to be translated in the text to be translated, selecting an optimal text translation sentence from the text translation sentence set corresponding to the sentence to be translated according to the target weight parameters of the at least two translation models; the text translation sentence set consists of the at least two text translation sentences corresponding to the sentence to be translated in the text translations output by the at least two translation models;
and merging the optimal text translation sentences corresponding to all the sentences to be translated in the text to be translated into the optimal text translation.
Optionally, for each sentence to be translated in the text to be translated, selecting an optimal text translation sentence in the text translation sentence set corresponding to the sentence to be translated according to the target weight parameters of the at least two translation models, including:
calculating the translation evaluation score of each text translation sentence in the text translation sentence set corresponding to the sentence to be translated, according to the target weight parameters of the at least two translation models and the translation probabilities of the text translation sentences output by the at least two translation models;
and selecting the text translation sentence with the highest translation evaluation score from the text translation sentence set corresponding to the sentence to be translated as the optimal text translation sentence of the sentence to be translated.
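The sentence-level selection and merging described above can be sketched as follows (the data layout and names are illustrative assumptions, not the patent's implementation): each model contributes one candidate per source sentence, the evaluation score is the model's target weight times the sentence's translation probability, and the highest-scoring candidate per sentence is kept and merged.

```python
def fuse_translations(weights, model_outputs):
    """weights[m] is the target weight parameter of model m; model_outputs[m]
    is a list of (translation_sentence, probability) pairs, one pair per
    source sentence, in source-sentence order."""
    n_sentences = len(model_outputs[0])
    best_sentences = []
    for i in range(n_sentences):
        # Translation evaluation score: model weight x sentence probability.
        candidates = [(w * outs[i][1], outs[i][0])
                      for w, outs in zip(weights, model_outputs)]
        best_sentences.append(max(candidates)[1])
    # Merge the per-sentence winners into the optimal text translation.
    return " ".join(best_sentences)

fused = fuse_translations(
    [0.6, 0.4],
    [[("the cat", 0.9), ("sat down", 0.4)],
     [("a cat", 0.7), ("sat", 0.9)]])
# sentence 1: 0.6*0.9 > 0.4*0.7 -> "the cat"; sentence 2: 0.4*0.9 > 0.6*0.4 -> "sat"
```

Here the first model wins sentence 1 and the second wins sentence 2, so the fused result mixes outputs from both models, which is the point of the reranking fusion.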
The application also provides a model parameter searching device, which comprises:
the corpus translation module is configured to acquire respective translations which are output after the at least two translation models translate the corpus in the corpus, and the translation probability of each translation sentence in the translations;
the weight parameter set searching module is configured to search weight parameter sets corresponding to the at least two translation models in a parameter space based on the translation and the translation probability of each translation sentence in the translation;
and the target weight parameter determining module is configured to respectively take the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models.
Optionally, the weight parameter set searching module includes:
A search tree construction sub-module configured to construct a search tree based on the translation and the translation probability of each translation sentence in the translation; the weight parameter sets in the parameter space are in one-to-one correspondence with the search nodes in the search tree;
and the searching sub-module is configured to search the weight parameter sets corresponding to the at least two translation models in the parameter space according to the search tree.
Optionally, the model parameter searching device includes:
the translation module is configured to input the text to be translated into the at least two translation models respectively for text translation, and to output text translations for the text to be translated;
and the translation fusion module is configured to fuse the text translations output by the at least two translation models by reranking to obtain the optimal text translation of the text to be translated.
The present application also provides a computing device comprising:
a memory and a processor;
the memory is used for storing computer executable instructions, and the processor implements the steps of the model parameter searching method when executing the computer executable instructions.
The present application also provides a computer readable storage medium storing computer instructions that when executed by a processor implement the steps of the model parameter searching method.
Compared with the prior art, the application has the following advantages:
the application provides a model parameter searching method, which comprises the following steps: obtaining respective translations which are output after at least two translation models translate the corpus in the corpus, and the translation probability of each translation sentence in the translations; searching weight parameter sets corresponding to the at least two translation models in a parameter space based on the translation and the translation probability of each translation sentence in the translation; and respectively taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models.
In the model parameter searching method, when the parameter space is searched for the target weight parameters that the translation models require for a translation task, the corpus in the corpus library and the translation probabilities of the corpus translated by the translation models are combined to search the weight parameters of the at least two translation models, which improves search efficiency; on the basis of the searched target weight parameters, the translation models applying them achieve higher translation accuracy when executing translation tasks and obtain a more accurate translation result.
Drawings
FIG. 1 is a process flow diagram of a model parameter searching method provided by an embodiment of the application;
FIG. 2 is a schematic diagram of a search tree provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a model parameter searching apparatus according to an embodiment of the present application;
FIG. 4 is a block diagram of a computing device provided by an embodiment of the application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The present application may be embodied in many other forms than those herein described, and those skilled in the art will readily appreciate that the present application may be similarly embodied without departing from the spirit or essential characteristics thereof, and therefore the present application is not limited to the specific embodiments disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly a second as a first, without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
The application provides a model parameter searching method, a model parameter searching device, a computing device and a computer readable storage medium. The following detailed description is given, one by one, with reference to the accompanying drawings of the embodiments provided by the present application, and the respective steps of the method are described.
The embodiment of the application provides a model parameter searching method, which comprises the following steps:
referring to fig. 1, a process flow diagram of a model parameter searching method provided in this embodiment is shown, and referring to fig. 2, a schematic diagram of a search tree provided in this embodiment is shown.
Step S102, obtaining respective translations which are output after at least two translation models translate the corpus in the corpus, and the translation probability of each translation sentence in the translations.
In a machine translation task, to improve translation accuracy, a plurality of (two or more) translation models are often used to translate the text to be translated, and the best result among their outputs is selected as the translation of the text. In practice, however, the models may adopt different translation architectures or algorithms, and their translation quality may differ across different types of text. To further improve accuracy, the translation results of the different models can therefore be fused: for each sentence to be translated in the text, the translation sentence with the best translation effect is selected from the results of the different models, and the selected translation sentences for all sentences to be translated are then fused, yielding a more accurate translation of the text to be translated.
In implementation, fusing the translations of the sentences to be translated requires setting corresponding weight parameters for the plurality of translation models. On the basis of these weight parameters, the translation effect of each model on each sentence to be translated is evaluated quantitatively, the translation sentence with the best effect for each sentence is selected, and fusion is finally performed on the selected best translation sentences, producing a more accurate translation result.
It should be noted that, when setting the weight parameters of the plurality of translation models, there is no fixed rule to follow, so a specific value or value range of the weight parameters cannot be determined in advance: the range of the weight parameters is unbounded, and their parameter space tends to infinity. How to determine, in this infinite parameter space, the optimal weight parameters of the plurality of translation models, or target weight parameters close to the optimal ones, therefore becomes the greatest difficulty in fusing the translation results of the plurality of models into a more accurate translation.
In the model parameter searching method, when setting the weight parameters of the plurality of translation models, the results of the models translating the corpus in the corpus library are combined, the translation effects of the models on the corpus serve as a reference, and the weight parameters of the models are searched in the infinite parameter space. After the searched weight parameters are applied to the corresponding translation models, the translation obtained by fusing the models' results on the basis of those weight parameters has higher accuracy.
In implementation, the corpus in the corpus library is input into each of the plurality of translation models, and each model outputs a translation corresponding to the corpus after translating it. Specifically, each sentence of the corpus has a corresponding translation sentence in the translation produced by each model, and the translation sentences in each model's translation correspond one-to-one to the sentences of the corpus.
It should be noted that, in the process of translating the corpus to output the corresponding translation, each translation model also outputs the translation probability of each translation sentence in that translation. In the embodiment of the application, the translation probability of each translation sentence may be a value representing the translation model's translation accuracy for that sentence, or another value related to that accuracy, for example a value representing the model's translation loss for the sentence; this is not limited herein.
For example, 4000 sentences of the corpus to be translated are respectively input into 3 translation models: translation Model 1, translation Model 2, and translation Model 3;
after translation Model 1 translates the input 4000-sentence corpus to be translated, it outputs a translation t1 corresponding to the corpus; t1 likewise comprises 4000 translation sentences, which correspond one-to-one to the 4000 sentences to be translated in the corpus; meanwhile, translation Model 1 also outputs the translation probability score of each of the 4000 translation sentences in t1;
similarly, after translation Model 2 translates the input 4000-sentence corpus to be translated, it outputs a translation t2 corresponding to the corpus; t2 likewise comprises 4000 translation sentences corresponding one-to-one to the 4000 sentences to be translated; meanwhile, translation Model 2 also outputs the translation probability score of each of the 4000 translation sentences in t2;
likewise, after translation Model 3 translates the input 4000-sentence corpus to be translated, it outputs a translation t3 comprising 4000 translation sentences corresponding one-to-one to the 4000 sentences to be translated; meanwhile, translation Model 3 also outputs the translation probability score of each of the 4000 translation sentences in t3.
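The data flow of this example can be sketched as follows (the `translate` stand-in and all names are invented for illustration; a real translation model would return genuine translations and probability scores):

```python
corpus = ["src sentence %d" % i for i in range(4000)]  # 4000-sentence corpus

def translate(model_name, sentences):
    # Stand-in for a trained translation model: returns one
    # (translation sentence, translation probability score) pair per input.
    return [("%s translation of sentence %d" % (model_name, i), 0.5)
            for i in range(len(sentences))]

# Each model outputs a translation with one translation sentence per corpus
# sentence, plus the per-sentence translation probability score.
t1 = translate("Model1", corpus)
t2 = translate("Model2", corpus)
t3 = translate("Model3", corpus)
```

The one-to-one correspondence between corpus sentences and translation sentences is what later lets the weight search and fusion work sentence by sentence.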
Step S104, searching weight parameter sets corresponding to the at least two translation models in a parameter space based on the translation and the translation probability of each translation sentence in the translation.
After obtaining the respective translations output by the at least two translation models for the corpus in the corpus library, together with the translation probability of each translation sentence in the translations, the weight parameters of the at least two translation models are searched in the same parameter space according to the obtained translations and translation probabilities;
in the embodiment of the application, the weight parameters in the parameter space exist in the form of weight parameter groups, and the number of the weight parameters contained in each weight parameter group is consistent with the number of the translation models in the at least two translation models, so that the weight parameter groups finally searched in the parameter search space can be applied to the at least two translation models.
In the embodiment of the present application, based on the translation and the translation probability of each translation sentence in the translation, the set of weight parameters corresponding to the at least two translation models is searched in the parameter space, preferably implemented in the following manner:
1) Constructing a search tree based on the translation and the translation probability of each translation sentence in the translation, wherein the weight parameter sets in the parameter space correspond one-to-one to the search nodes in the search tree.
In the implementation process, for each search node corresponding to a weight parameter set in the parameter space, preferably, the weight parameters in the corresponding weight parameter set are taken as the weight parameters of the at least two translation models, and the heuristic cost of the search node is calculated in combination with the translation probability of each translation sentence in the translation; the heuristic cost of the search node is obtained by summing the model heuristic costs of the at least two translation models, wherein the model heuristic cost of each translation model is the sum, over the translation sentences in its translation, of the product of the translation model's weight parameter and the translation probability of each translation sentence.
Further, in the process of constructing the search tree, the lower-layer search node of any one search node in the search tree is determined by adopting the following manner:
a) Determining, by a Gaussian algorithm, a set of adjacent search nodes that are adjacent to and connected with the search node in the search tree;
b) Selecting, according to the calculated heuristic cost of each adjacent search node in the search node set, at least one adjacent search node with the highest heuristic cost as a lower-layer search node of the search node.
Furthermore, after determining the set of adjacent search nodes connected with the search node in the search tree, and before selecting the adjacent search node(s) with the highest heuristic cost as the lower-layer search node(s), the search nodes in the set may be filtered to further improve search efficiency: reducing the number of search nodes in the set by a rejection method improves the efficiency of selecting lower-layer search nodes on the basis of the set, and thereby the overall search efficiency.
For example, the search tree shown in fig. 2 is used to search the parameter space, by a beam search algorithm, for the weight parameter sets corresponding to translation Model 1, translation Model 2 and translation Model 3; each weight parameter set in the parameter space has a corresponding search node in the search tree. The search tree is constructed as follows:
In the process of constructing the search tree, an initial search node is firstly required to be determined, and the initial search node can be randomly determined by adopting a random algorithm or a specific search node is designated as the initial search node before searching;
after the initial search node is determined, the search nodes of the next layer below the initial search node are determined starting from the initial search node, and the number of nodes in the next layer is consistent with the beam width (Beam Width) of the beam search algorithm, which is set to 2 here; specifically, in determining the next-layer search nodes of the initial search node, firstly, the heuristic cost of each candidate search node of the next layer is calculated; secondly, the candidate search nodes of the next layer are sorted from high heuristic cost to low heuristic cost; and thirdly, the 2 search nodes with the highest heuristic cost are selected from the sorted candidates as the next-layer search nodes of the initial search node, shown as the search node Top1 and the search node Top2 in fig. 2;
the heuristic cost of the search node Top1 is the sum of the Model heuristic costs of the translation Model 1, the translation Model 2 and the translation Model 3;
specifically, the weight parameters of the translation Model 1 in the weight parameter set corresponding to the search node Top1 are multiplied by the translation probability scores of the 4000 translation sentences in the translation t1 respectively, and 4000 products are summed to obtain the Model heuristic cost h1 of the translation Model 1; similarly, multiplying the weight parameter of the translation Model 2 in the weight parameter set corresponding to the search node Top1 by the translation probability score of each of 4000 translation sentences in the translation t2, and summing up 4000 products to obtain the Model heuristic cost h2 of the translation Model 2; multiplying the weight parameter of the translation Model 3 in the weight parameter set corresponding to the search node Top1 by the translation probability score of each of 4000 translation sentences in the translation t3, and summing up 4000 products to obtain the Model heuristic cost h3 of the translation Model 3;
And summing the model heuristic cost h1 of the translation Model 1, the model heuristic cost h2 of the translation Model 2 and the model heuristic cost h3 of the translation Model 3 to obtain the heuristic cost H1 of the search node Top1; similarly, the heuristic costs of the other search nodes in the search tree are obtained in the same manner, and are not described in detail herein;
further, for the search node Top1 and the search node Top2 of the next layer below the initial search node, similarly to the above-described processing of the initial search node, the 2 next-layer search nodes of the search node Top1 are determined: the search node Top11 and the search node Top12; and the 2 next-layer search nodes of the search node Top2: the search node Top21 and the search node Top22;
and so on, determining 4 search nodes of each subsequent layer, and constructing a search tree based on the determined search nodes.
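The layer-by-layer construction just described (each retained node contributing its beam-width best neighbors to the next layer) can be sketched as follows; the neighbor function and the cost function are stand-ins, since the embodiment derives both from the parameter space and the translation probabilities:

```python
def build_beam_tree(start, neighbors, heuristic_cost, depth, beam_width=2):
    # Each layer is formed by expanding every node kept in the previous
    # layer and retaining only its beam_width neighbors with the highest
    # heuristic cost, as in the Top1/Top2 ... TopN1..TopN4 example.
    layers = [[start]]
    for _ in range(depth):
        next_layer = []
        for node in layers[-1]:
            ranked = sorted(neighbors(node), key=heuristic_cost, reverse=True)
            next_layer.extend(ranked[:beam_width])
        layers.append(next_layer)
    return layers

# Toy example: integer node ids; using -n as the cost makes smaller ids
# score higher, so each node keeps its two smallest neighbors.
layers = build_beam_tree(1, lambda n: [n * 10, n * 10 + 1, n * 10 + 2],
                         lambda n: -n, depth=2)
```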
In addition, in practical application, a breadth-first policy may be used to construct the search tree. In determining the search nodes in the search tree, a heuristic search algorithm may further be used; the purpose of the heuristic search algorithm is to selectively keep the search nodes that can reach the target node. Specifically, the heuristic cost of each search node is calculated by the heuristic search algorithm, where the heuristic cost refers to the loss from the current search node to the target search node; the search nodes of each layer are then sorted by heuristic cost; finally, a preset number of search nodes with the best heuristic cost are kept in each layer, only the kept search nodes of each layer are searched in depth at the next layer, and the other search nodes are rejected.
In a preferred implementation manner provided by the embodiment of the present application, filtering the search nodes in the search node set is implemented in the following manner:
a) Aiming at the searching nodes in the searching node set, taking the weight parameter set as the weight parameters of the at least two translation models according to the weight parameter set in the parameter space corresponding to the searching nodes;
b) Based on the weight parameters of the at least two translation models, adopting reordering to fuse text translations output by the at least two translation models respectively to obtain a reference text translation of the text to be translated;
c) Comparing the reference text translation with the real translation of the corpus, and determining the translation accuracy and/or the translation loss of the reference text translation relative to the real translation;
the translation accuracy refers to the accuracy of the reference text translation relative to the real translation, for example, the text similarity between the reference text translation and the real translation is 89% by calculating the similarity algorithm, and the translation accuracy of the reference text translation relative to the real translation is 89%.
The translation loss refers to a loss difference between the reference text translation and the real translation, for example, if the loss of the reference text translation relative to the real translation is calculated to be 0.11 through a loss function, the translation loss of the reference text translation relative to the real translation is 0.11.
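For example, translation accuracy and translation loss could be computed as below. The token-overlap similarity used here is only an illustrative stand-in; the embodiment does not fix a specific similarity algorithm or loss function:

```python
def token_similarity(candidate, reference):
    # Toy similarity: the fraction of candidate tokens that also appear
    # in the reference translation (illustrative stand-in only).
    cand, ref = candidate.split(), set(reference.split())
    return sum(tok in ref for tok in cand) / len(cand) if cand else 0.0

accuracy = token_similarity("the cat sat on the mat", "the cat is on the mat")
loss = 1.0 - accuracy  # treating translation loss as the complement of accuracy
```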
d) Judging whether the translation accuracy and/or the translation loss is greater than the translation accuracy and/or the translation loss corresponding to the upper layer search node of the search node;
If the translation accuracy of the search node is higher than that of the upper-layer search node, the search node is considered worth searching more deeply, so the search node is retained in the search node set;
if the translation accuracy of the search node is not improved compared with that of the upper-layer search node, the search node does not need to be searched more deeply, and the search node is removed from the search node set to which it belongs.
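The keep/reject decision above can be sketched as a simple filter; the node labels and accuracy values are hypothetical:

```python
def filter_nodes(candidates, parent_accuracy, accuracy_of):
    # Retain only the candidate search nodes whose translation accuracy
    # improves on that of the upper-layer search node; reject the rest.
    return [n for n in candidates if accuracy_of(n) > parent_accuracy]

accuracies = {"a": 0.91, "b": 0.88, "c": 0.93}
kept = filter_nodes(["a", "b", "c"], 0.90, accuracies.get)
```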
2) And searching weight parameter sets corresponding to the at least two translation models in the parameter space according to the search tree.
Along the above example, after the search tree is built, searching is performed according to the order of the search nodes in the search tree. Specifically, the number of layers of the search tree may be specified in advance; then, among the 4 search nodes in the last layer (the N-th layer, corresponding to the search nodes TopN1, TopN2, TopN3 and TopN4), the search node with the highest heuristic cost (i.e., the search node TopN1) is selected as the target search node of the weight parameter search in the parameter space, and the 3 weight parameters contained in the weight parameter set corresponding to the search node TopN1 are the respective target weight parameters of the 3 translation models searched in the parameter space, i.e., the translation Model 1, the translation Model 2 and the translation Model 3.
In practical applications, the process of searching the parameter space for the weight parameter sets corresponding to the at least two translation models may be implemented by using a corresponding search algorithm, for example, a greedy algorithm (Greedy Algorithm) or a beam search algorithm (Beam Search Algorithm).
As described above, the parameter space tends to be infinite. In the embodiment of the present application, considering that when random sampling is used and the parameter space is searched based on a greedy algorithm, the search easily falls into a locally optimal solution and the search efficiency is low, a beam search algorithm is preferably adopted in order to improve the efficiency of searching the parameter space for the weight parameter sets corresponding to the at least two translation models; according to the characteristics of the beam search algorithm, some search nodes with poor quality are rejected in each step of the deep search and some search nodes with higher quality are retained, so that the search efficiency is improved.
And S106, taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models respectively.
And after the target weight parameter set corresponding to the target search node is searched out in the parameter space, the weight parameters contained in the target weight parameter set are respectively taken as the target weight parameters of the at least two translation models. For example, the 3 weight parameters included in the weight parameter set corresponding to the target search node TopN1 searched out in the parameter space are applied to the 3 translation models, i.e. the translation Model 1, the translation Model 2 and the translation Model 3, respectively, as their target weight parameters.
In an embodiment of the present application, after the target weight parameters are determined, the translation models may be applied to an actual translation task to translate the text to be translated in that task, so that the translation models perform more accurate translation based on the target weight parameters. Preferably, the text translation output by each of the at least two translation models is composed of at least one text translation sentence, and the text translation sentences in the text translation correspond to the sentences to be translated in the text to be translated respectively. Therefore, in the translation process, the translation results of the at least two translation models are fused in units of sentences, and the accuracy of the finally obtained translation result is higher.
Specifically, in the process of fusing the text translations output by each of the at least two translation models by reordering, the text translations output by each of the translation models are preferably fused in the following manner to obtain the optimal text translation:
1) And selecting an optimal text translation sentence in a text translation sentence set corresponding to the text translation sentence according to the target weight parameters of the at least two translation models aiming at each text to be translated in the text to be translated, wherein the text translation sentence set consists of at least two text translation sentences corresponding to the text translation sentences output by the at least two translation models.
Wherein, the optimal text translation is preferably determined by the following method:
a) Calculating the translation evaluation score of each text translation sentence in the text translation sentence set corresponding to the text translation sentence to be translated according to the target weight parameters of the at least two translation models and the translation probability of each text translation sentence in the text to be translated output by the at least two translation models;
b) And selecting the text translation sentence with the highest translation evaluation score from the text translation sentence set corresponding to the sentence to be translated as the optimal text translation sentence of the sentence to be translated.
2) And merging, according to the order of the sentences to be translated in the text to be translated, the optimal text translation sentences corresponding to all the sentences to be translated into the optimal text translation.
For example, the text to be translated is an article text composed of 10 Chinese sentences, and the aim is to translate this Chinese article text into English;
firstly, the article text is respectively input into the 3 translation models, i.e. the translation Model 1, the translation Model 2 and the translation Model 3, for translation, wherein the translation Model 1 outputs an English translation text1 consisting of 10 English translation sentences and the translation probability of each of the 10 English translation sentences, and the 10 English translation sentences in the English translation text1 correspond one-to-one to the 10 Chinese sentences in the Chinese article text; the translation Model 2 and the translation Model 3 are similar to the translation Model 1: the translation Model 2 outputs an English translation text2 consisting of 10 English translation sentences and the translation probability of each of the 10 English translation sentences, and the translation Model 3 outputs an English translation text3 consisting of 10 English translation sentences and the translation probability of each of the 10 English translation sentences;
As can be seen, each sentence in the Chinese article text corresponds to one English translation sentence in each of the English translation text1, the English translation text2 and the English translation text3, i.e.: each Chinese sentence corresponds to an English translation sentence set consisting of 3 English translation sentences;
secondly, the optimal English translation sentence of each of the 10 Chinese sentences in the Chinese article text is determined. The method for determining the optimal English translation sentence is the same for each Chinese sentence; taking any one Chinese sentence in the Chinese article text as an example, the English translation sentence set corresponding to this Chinese sentence consists of 3 English translation sentences: sentence 1, sentence 2 and sentence 3, wherein sentence 1 is the English translation sentence output by the translation Model 1 after translating this Chinese sentence, sentence 2 is the English translation sentence output by the translation Model 2 after translating this Chinese sentence, and sentence 3 is the English translation sentence output by the translation Model 3 after translating this Chinese sentence;
on this basis, the translation evaluation scores of the English translation sentences sentence 1, sentence 2 and sentence 3 in the English translation sentence set are calculated: the translation evaluation score of sentence 1 is the product p1 of the weight parameter of the translation Model 1 and the translation probability of sentence 1; the translation evaluation score of sentence 2 is the product p2 of the weight parameter of the translation Model 2 and the translation probability of sentence 2; and the translation evaluation score of sentence 3 is the product p3 of the weight parameter of the translation Model 3 and the translation probability of sentence 3. Comparing p1, p2 and p3, if p3 is the largest, the English translation sentence sentence 3 corresponding to p3 is taken as the optimal English translation sentence of this Chinese sentence; similarly, the optimal English translation sentence of each of the 10 Chinese sentences in the Chinese article text is determined;
Finally, on the basis of determining the respective optimal English translation sentences of the 10 Chinese sentences in the Chinese article text, the 10 optimal English translation sentences are combined in sequence according to the order of their corresponding Chinese sentences in the Chinese article text, and the English article finally obtained is the English translation of the Chinese article text.
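The sentence-level fusion walked through above can be sketched as follows; the two-model, two-sentence inputs are invented for illustration (the example in the text uses 3 models and 10 sentences):

```python
def fuse_translations(weights, model_outputs):
    # model_outputs: one list per translation model of
    # (translation sentence, translation probability) pairs, aligned by
    # source-sentence position. For each source sentence, the candidate
    # with the highest (weight parameter x translation probability) is
    # kept, and the winners are joined in source-sentence order.
    fused = []
    for candidates in zip(*model_outputs):
        scored = [(w * p, s) for w, (s, p) in zip(weights, candidates)]
        fused.append(max(scored)[1])
    return " ".join(fused)

m1 = [("He runs fast", 0.7), ("It is late", 0.6)]
m2 = [("He is running fast", 0.9), ("It got late", 0.4)]
best = fuse_translations([0.5, 0.5], [m1, m2])
# sentence 1: 0.35 vs 0.45 -> model 2 wins; sentence 2: 0.30 vs 0.20 -> model 1 wins
```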
In summary, in the model parameter searching method, in the process of searching the parameter space for the target weight parameters required by the translation models to perform a translation task, the weight parameters of the at least two translation models are searched in the parameter space by combining the corpus in the corpus library with the translation probabilities of the corpus translations output by the translation models, so that the search efficiency is improved; and on the basis of the searched target weight parameters required by the at least two translation models to perform the translation task, a translation model applying the target weight parameters achieves higher translation accuracy in the translation task and obtains a more accurate translation result.
An embodiment of a model parameter searching apparatus provided by the present application is as follows:
in the foregoing embodiments, a model parameter searching method is provided, and correspondingly, the present application further provides a model parameter searching apparatus, which is described below with reference to the accompanying drawings.
Referring to fig. 3, a schematic diagram of an embodiment of a model parameter searching apparatus provided by the present application is shown.
Since the apparatus embodiments are substantially similar to the method embodiments, the description is relatively simple, and reference should be made to the corresponding descriptions of the method embodiments provided above for relevant parts. The device embodiments described below are merely illustrative.
The application provides a model parameter searching device, comprising:
the corpus translation module 302 is configured to obtain respective translations of the corpus in the corpus library, which are output after translation is performed on the corpus in the corpus library by at least two translation models, and the translation probability of each translation sentence in the translations;
a weight parameter set search module 304 configured to search a parameter space for weight parameter sets corresponding to the at least two translation models based on the translation and a translation probability of each translation sentence in the translation;
the target weight parameter determining module 306 is configured to take the weight parameters included in the searched target weight parameter set as target weight parameters of the at least two translation models respectively.
Optionally, the weight parameter set search module 304 includes:
a search tree construction sub-module configured to construct a search tree based on the translation and the translation probability of each translation sentence in the translation; the weight parameter sets in the parameter space are in one-to-one correspondence with the search nodes in the search tree;
And the searching sub-module is configured to search the weight parameter sets corresponding to the at least two translation models in the parameter space according to the search tree.
Optionally, in the running process of the search tree construction submodule, aiming at a search node in the search tree corresponding to a weight parameter set in the parameter space, taking the weight parameter set as a weight parameter of the at least two translation models according to the weight parameter set in the parameter space corresponding to the search node, and calculating heuristic cost of the search node by combining the translation probability of each translation sentence in the translation;
the heuristic cost of the search node is obtained by summing the model heuristic costs of the at least two translation models, wherein the model heuristic cost of each translation model is the sum, over the translation sentences in its translation, of the products of the translation model's weight parameter and the translation probability of each translation sentence.
Optionally, the lower-layer search node of any one search node in the search tree is determined by running the following units:
a search node set determining unit configured to determine, by using a Gaussian algorithm, a search node set composed of the adjacent search nodes that are adjacent to and have a connection relationship with the search node in the search tree;
The lower-layer search node determining unit is configured to select at least one adjacent search node with the highest heuristic cost from the search node set as the lower-layer search node of the search nodes according to the heuristic cost of each adjacent search node in the search node set obtained through calculation.
Optionally, the lower-layer search node of any one search node in the search tree is further determined by running the following units:
the weight parameter determining unit is configured to take the weight parameter set as the weight parameters of the at least two translation models according to the weight parameter set in the parameter space corresponding to the search node for the search node in the search node set;
the reference text translation determining unit is configured to fuse text translations output by the at least two translation models respectively by adopting reordering based on the weight parameters of the at least two translation models to obtain a reference text translation of the text to be translated;
a text translation comparison unit configured to compare the reference text translation with a real translation of the corpus, and determine a translation accuracy and/or a translation loss of the reference text translation relative to the real translation;
The judging unit is configured to judge whether the translation accuracy and/or the translation loss is greater than the translation accuracy and/or the translation loss corresponding to the upper-layer search node of the search node;
and a rejecting unit configured to, if the translation accuracy and/or the translation loss is not greater than that corresponding to the upper-layer search node, remove the search node from the search node set to which the search node belongs.
Optionally, the weight parameter set search module 304 is implemented based on a beam search algorithm.
Optionally, the model parameter searching device includes:
the translation module is configured to input texts to be translated into the at least two translation models respectively to carry out text translation and output text translations aiming at the texts to be translated respectively;
and the translation fusion module is configured to fuse the text translations output by the at least two translation models respectively by adopting reordering to obtain the optimal text translation of the text to be translated.
Optionally, the text translation output by each of the at least two translation models is composed of at least one text translation sentence, and the text translation sentences in the text translation correspond to the sentences to be translated in the text to be translated respectively.
Optionally, the translation fusion module includes:
the optimal text translation sentence selection unit is configured to select optimal text translation sentences in a text translation sentence set corresponding to each to-be-translated sentence in the to-be-translated text according to the target weight parameters of the at least two translation models; the text translation sentence set consists of at least two text translation sentences corresponding to the to-be-translated sentences in the text translation text output by the at least two translation models;
And the optimal text translation sentence fusion unit is configured to merge, according to the order of the sentences to be translated in the text to be translated, the optimal text translation sentences corresponding to all the sentences to be translated into the optimal text translation.
Optionally, the optimal text translation sentence selection unit includes:
the translation evaluation score calculating subunit is configured to calculate the translation evaluation score of each text translation sentence in the text translation sentence set corresponding to the text translation sentence to be translated according to the target weight parameters of the at least two translation models and the translation probability of each text translation sentence in the text to be translated output by the at least two translation models respectively;
and the optimal text translation sentence selection subunit is configured to select a text translation sentence with the highest translation evaluation score from the text translation sentence set corresponding to the sentence to be translated as the optimal text translation sentence of the sentence to be translated.
An embodiment of a computing device provided by the present application is as follows:
fig. 4 is a block diagram illustrating a configuration of a computing device 400 according to an embodiment of the present description. The components of the computing device 400 include, but are not limited to, a memory 410 and a processor 420. Processor 420 is coupled to memory 410 via bus 430 and database 450 is used to hold data.
Computing device 400 also includes access device 440, access device 440 enabling computing device 400 to communicate via one or more networks 460. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 440 may include one or more of any type of network interface, wired or wireless (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 400, as well as other components not shown in FIG. 4, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device shown in FIG. 4 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 400 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 400 may also be a mobile or stationary server.
The present application provides a computing device comprising a memory 410 and a processor 420;
the memory 410 is configured to store computer-executable instructions that when executed by the processor 420 implement:
obtaining respective translations which are output after at least two translation models translate the corpus in the corpus, and the translation probability of each translation sentence in the translations;
searching weight parameter sets corresponding to the at least two translation models in a parameter space based on the translation and the translation probability of each translation sentence in the translation;
and respectively taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models.
Optionally, the searching the weight parameter set corresponding to the at least two translation models in the parameter space based on the translation and the translation probability of each translation sentence in the translation includes:
constructing a search tree based on the translation and the translation probability of each translation sentence in the translation; the weight parameter sets in the parameter space are in one-to-one correspondence with the search nodes in the search tree;
and searching weight parameter sets corresponding to the at least two translation models in the parameter space according to the search tree.
Optionally, in the executing process of constructing the search tree instruction based on the translation and the translation probability of each translation sentence in the translation, the following operations are executed for the search node in the search tree corresponding to the weight parameter set in the parameter space:
according to the weight parameter set in the parameter space corresponding to the search node, taking the weight parameter set as the weight parameter of the at least two translation models, and calculating the heuristic cost of the search node by combining the translation probability of each translation sentence in the translation;
the heuristic cost of the search node is obtained by summing the model heuristic costs of the at least two translation models, wherein the model heuristic cost of each translation model is the sum, over the translation sentences in its translation, of the products of the translation model's weight parameter and the translation probability of each translation sentence.
Optionally, the lower layer search node of any one search node in the search tree is determined by adopting the following manner:
determining, by adopting a Gaussian algorithm, a search node set composed of the adjacent search nodes that are adjacent to and have a connection relationship with the search node in the search tree;
and selecting at least one adjacent search node with the highest heuristic cost from the search node set as a lower-layer search node of the search nodes according to the calculated heuristic cost of each adjacent search node in the search node set.
Optionally, after the instruction for determining, by using a Gaussian algorithm, the search node set of the adjacent search nodes connected with the search node in the search tree is executed, and before the instruction for selecting, according to the calculated heuristic cost of each adjacent search node in the search node set, at least one adjacent search node with the highest heuristic cost from the search node set as a lower-layer search node of the search node is executed, the method includes:
aiming at the searching nodes in the searching node set, taking the weight parameter set as the weight parameters of the at least two translation models according to the weight parameter set in the parameter space corresponding to the searching nodes;
based on the weight parameters of the at least two translation models, adopting reordering to fuse text translations output by the at least two translation models respectively to obtain a reference text translation of the text to be translated;
comparing the reference text translation with the real translation of the corpus, and determining the translation accuracy and/or the translation loss of the reference text translation relative to the real translation;
judging whether the translation accuracy and/or the translation loss is greater than the translation accuracy and/or the translation loss corresponding to the upper layer search node of the search node;
And if the translation accuracy and/or the translation loss is not greater, removing the search node from the search node set to which the search node belongs.
Optionally, the instruction for searching the parameter space for the weight parameter sets corresponding to the at least two translation models based on the translation and the translation probability of each translation sentence in the translation is implemented based on a beam search algorithm.
Optionally, after the instruction of taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models is executed, the method includes:
respectively inputting the text to be translated into the at least two translation models for text translation, and respectively outputting text translations aiming at the text to be translated;
and fusing the text translations output by the at least two translation models by adopting reordering to obtain the optimal text translation of the text to be translated.
Optionally, the text translation output by each of the at least two translation models is composed of at least one text translation sentence, and the text translation sentences correspond, respectively, to the sentences to be translated in the text to be translated.
Optionally, the fusing the text translations output by the at least two translation models by reordering to obtain the optimal text translation of the text to be translated includes:
for each sentence to be translated in the text to be translated, selecting an optimal text translation sentence from the text translation sentence set corresponding to that sentence according to the target weight parameters of the at least two translation models; the text translation sentence set consists of the at least two text translation sentences that correspond to the sentence to be translated in the text translations output by the at least two translation models;
and merging, in order, the optimal text translation sentences corresponding to all the sentences to be translated in the text to be translated into the optimal text translation.
Optionally, for each sentence to be translated in the text to be translated, selecting an optimal text translation sentence in the text translation sentence set corresponding to the sentence to be translated according to the target weight parameters of the at least two translation models, including:
calculating the translation evaluation score of each text translation sentence in the text translation sentence set corresponding to the sentence to be translated, according to the target weight parameters of the at least two translation models and the translation probability of each text translation sentence output by the at least two translation models;
and selecting the text translation sentence with the highest translation evaluation score from the text translation sentence set corresponding to the sentence to be translated as the optimal text translation sentence of the sentence to be translated.
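The per-sentence selection above can be sketched in Python for illustration (the data layout — one `(sentence, per_model_probabilities)` pair per candidate — is an assumption, not the patent's API):

```python
def best_translation_sentence(candidates, weights):
    """Translation evaluation score of one source sentence's candidate
    translations: each model's target weight parameter times the
    translation probability that model assigned to the candidate,
    summed over models; the candidate with the highest score wins.
    """
    def score(candidate):
        _, per_model_probs = candidate
        return sum(w * p for w, p in zip(weights, per_model_probs))
    return max(candidates, key=score)[0]

# Toy usage: two candidates for one source sentence, two models.
cands = [("hello world", (0.9, 0.2)), ("hi world", (0.3, 0.8))]
print(best_translation_sentence(cands, weights=(0.7, 0.3)))  # → hello world
```

Applying this selection to every sentence to be translated and concatenating the winners, in source order, yields the fused optimal text translation.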
An embodiment of a computer-readable storage medium provided by the present application is as follows:
an embodiment of the present application also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, are configured to:
obtaining the respective translations output by at least two translation models after translating the corpus texts in a corpus, and the translation probability of each translation sentence in the translations;
searching weight parameter sets corresponding to the at least two translation models in a parameter space based on the translation and the translation probability of each translation sentence in the translation;
and respectively taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models.
Optionally, the searching the weight parameter set corresponding to the at least two translation models in the parameter space based on the translation and the translation probability of each translation sentence in the translation includes:
constructing a search tree based on the translation and the translation probability of each translation sentence in the translation; the weight parameter sets in the parameter space are in one-to-one correspondence with the search nodes in the search tree;
and searching weight parameter sets corresponding to the at least two translation models in the parameter space according to the search tree.
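To illustrate the one-to-one correspondence between weight parameter sets and search nodes, here is a minimal sketch that enumerates a discretized parameter space (the step size, and the constraint that the weights sum to 1, are illustrative assumptions, not requirements stated by the patent):

```python
import itertools

def weight_grid(num_models, step=0.25):
    """Enumerate the parameter space as weight sets, one weight per
    translation model, with entries summing to 1 — one hypothetical
    way to realize the one-to-one mapping between weight parameter
    sets and search nodes.
    """
    ticks = [round(i * step, 10) for i in range(int(1 / step) + 1)]
    return [w for w in itertools.product(ticks, repeat=num_models)
            if abs(sum(w) - 1.0) < 1e-9]

print(weight_grid(2))
# → [(0.0, 1.0), (0.25, 0.75), (0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]
```

Each tuple would label one node of the search tree; the search then walks this grid layer by layer rather than scoring every weight set exhaustively.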
Optionally, in the step of constructing a search tree based on the translations and the translation probability of each translation sentence in the translations, the following operations are performed for the search node in the search tree corresponding to each weight parameter set in the parameter space:
according to the weight parameter set in the parameter space corresponding to the search node, taking the weight parameter set as the weight parameter of the at least two translation models, and calculating the heuristic cost of the search node by combining the translation probability of each translation sentence in the translation;
the heuristic cost of the search node is obtained from the model heuristic cost of each translation model in the at least two translation models, where the model heuristic cost of each translation model is the sum of the products of the translation model's weight parameter and the translation probability of each translation sentence in its translation.
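For illustration, the heuristic cost just described amounts to the following computation (function and variable names are assumptions made for the sketch):

```python
def heuristic_cost(weights, sentence_probs):
    """Heuristic cost of a search node: for each translation model,
    sum the products of its weight parameter and the translation
    probability of each sentence in its translation (the model
    heuristic cost), then add the per-model costs together.

    `weights[m]` is model m's weight parameter; `sentence_probs[m]`
    holds the translation probabilities of model m's output sentences.
    """
    return sum(w * sum(probs)            # w*p_1 + w*p_2 + ... per model
               for w, probs in zip(weights, sentence_probs))

probs = [(0.8, 0.6), (0.5, 0.9)]          # two models, two sentences each
print(round(heuristic_cost((0.7, 0.3), probs), 6))  # → 1.4
```

This is the quantity by which candidate lower-layer nodes are ranked when the search tree is expanded.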
Optionally, the lower layer search node of any one search node in the search tree is determined by adopting the following manner:
determining a search node set of adjacent search nodes adjacent to the search nodes in the search tree and having a connection relationship by adopting a Gaussian algorithm;
and selecting at least one adjacent search node with the highest heuristic cost from the search node set as a lower-layer search node of the search nodes according to the calculated heuristic cost of each adjacent search node in the search node set.
Optionally, after the step of determining, by using a Gaussian algorithm, a search node set of adjacent search nodes that are adjacent to and connected with the search node in the search tree is performed, and before the step of selecting, according to the calculated heuristic cost of each adjacent search node in the search node set, at least one adjacent search node with the highest heuristic cost as a lower-layer search node of the search node is performed, the method includes:
for each search node in the search node set, taking the weight parameter set in the parameter space corresponding to that search node as the weight parameters of the at least two translation models;
based on the weight parameters of the at least two translation models, fusing the text translations output by the at least two translation models by reordering to obtain a reference text translation of the text to be translated;
comparing the reference text translation with the real translation of the corpus text, and determining the translation accuracy and/or the translation loss of the reference text translation relative to the real translation;
judging whether the translation accuracy and/or the translation loss is greater than the translation accuracy and/or the translation loss corresponding to the upper-layer search node of the search node;
and if not, eliminating the search node from the search node set to which it belongs.
Optionally, the step of searching the parameter space for the weight parameter sets corresponding to the at least two translation models, based on the translations and the translation probability of each translation sentence in the translations, is implemented based on a beam search algorithm.
Optionally, after the step of taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models is executed, the method includes:
respectively inputting the text to be translated into the at least two translation models for text translation, and respectively outputting text translations aiming at the text to be translated;
and fusing the text translations output by the at least two translation models by adopting reordering to obtain the optimal text translation of the text to be translated.
Optionally, the text translation output by each of the at least two translation models is composed of at least one text translation sentence, and the text translation sentences correspond, respectively, to the sentences to be translated in the text to be translated.
Optionally, the fusing the text translations output by the at least two translation models by reordering to obtain the optimal text translation of the text to be translated includes:
for each sentence to be translated in the text to be translated, selecting an optimal text translation sentence from the text translation sentence set corresponding to that sentence according to the target weight parameters of the at least two translation models; the text translation sentence set consists of the at least two text translation sentences that correspond to the sentence to be translated in the text translations output by the at least two translation models;
and merging, in order, the optimal text translation sentences corresponding to all the sentences to be translated in the text to be translated into the optimal text translation.
Optionally, for each sentence to be translated in the text to be translated, selecting an optimal text translation sentence in the text translation sentence set corresponding to the sentence to be translated according to the target weight parameters of the at least two translation models, including:
calculating the translation evaluation score of each text translation sentence in the text translation sentence set corresponding to the sentence to be translated, according to the target weight parameters of the at least two translation models and the translation probability of each text translation sentence output by the at least two translation models;
and selecting the text translation sentence with the highest translation evaluation score from the text translation sentence set corresponding to the sentence to be translated as the optimal text translation sentence of the sentence to be translated.
The above is an exemplary solution of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the model parameter searching method described above belong to the same concept; for details of the technical solution of the storage medium that are not described in detail, reference may be made to the description of the technical solution of the model parameter searching method.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that, for the sake of simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the application disclosed above are intended only to assist in the explanation of the application. Alternative embodiments are not intended to be exhaustive or to limit the application to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and the practical application, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and the full scope and equivalents thereof.

Claims (13)

1. A model parameter searching method, comprising:
obtaining the respective translations output by at least two translation models after translating the corpus texts in a corpus, and the translation probability of each translation sentence in the translations;
constructing a search tree based on the translation and the translation probability of each translation sentence in the translation, wherein the weight parameter set in the parameter space corresponds to the search nodes in the search tree one by one, and the construction process of the search tree comprises the following steps: starting from an initial search node, calculating heuristic costs of all search nodes of a next layer, sorting all search nodes of the next layer according to a sorting order of the heuristic costs from high to low, selecting a preset number of search nodes with the highest sorting order as the search nodes of the next layer, and so on, constructing a search tree based on the determined search nodes, wherein the heuristic costs of the search nodes are obtained according to model heuristic costs of each translation model in the at least two translation models, and the model heuristic costs of each translation model are the sum of products of weight parameters of the translation model and translation probability of each translation sentence in the translation;
searching weight parameter sets corresponding to the at least two translation models in the parameter space according to the search tree, wherein the number of weight parameters contained in the weight parameter sets is consistent with the number of translation models in the at least two translation models;
And respectively taking the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models.
2. The method according to claim 1, wherein in the step of constructing a search tree based on the translation and the translation probability of each translation in the translation, the following operations are performed for the weight parameter set in the parameter space corresponding to the search node in the search tree:
and taking the weight parameter set as the weight parameter of the at least two translation models according to the weight parameter set in the parameter space corresponding to the search node, and calculating the heuristic cost of the search node by combining the translation probability of each translation sentence in the translation.
3. The model parameter searching method according to claim 2, wherein the lower-level searching node of any one searching node in the searching tree is determined by the following method:
determining a search node set of adjacent search nodes adjacent to the search nodes in the search tree and having a connection relationship by adopting a Gaussian algorithm;
and selecting at least one adjacent search node with the highest heuristic cost from the search node set as a lower-layer search node of the search nodes according to the calculated heuristic cost of each adjacent search node in the search node set.
4. The model parameter searching method according to claim 3, wherein after the step of determining, by using a Gaussian algorithm, a search node set of adjacent search nodes that are adjacent to and connected with the search node in the search tree is performed, and before the step of selecting, from the search node set, at least one adjacent search node with the highest heuristic cost as a lower-layer search node of the search node, the method further comprises:
aiming at the searching nodes in the searching node set, taking the weight parameter set as the weight parameters of the at least two translation models according to the weight parameter set in the parameter space corresponding to the searching nodes;
based on the weight parameters of the at least two translation models, adopting reordering to fuse text translations output by the at least two translation models respectively to obtain a reference text translation of the corpus;
comparing the reference text translation with the real translation of the corpus, and determining the translation accuracy and/or the translation loss of the reference text translation relative to the real translation;
judging whether the translation accuracy and/or the translation loss is greater than the translation accuracy and/or the translation loss corresponding to the upper-layer search node of the search node;
and if not, eliminating the search node from the search node set to which it belongs.
5. The method according to claim 1, wherein the step of searching the parameter space for the weight parameter sets corresponding to the at least two translation models, based on the translations and the translation probability of each translation sentence in the translations, is implemented based on a beam search algorithm.
6. The model parameter searching method according to claim 1, wherein after the step of executing the weight parameters included in the searched target weight parameter set as the target weight parameters of the at least two translation models, respectively, comprises:
respectively inputting the text to be translated into the at least two translation models for text translation, and respectively outputting text translations aiming at the text to be translated;
and fusing the text translations output by the at least two translation models by adopting reordering to obtain the optimal text translation of the text to be translated.
7. The method according to claim 6, wherein the text translation output by each of the at least two translation models is composed of at least one text translation, and the translations in the text translations correspond to the sentences to be translated in the text to be translated, respectively.
8. The model parameter searching method according to claim 7, wherein the fusing the text translations output by the at least two translation models by reordering to obtain the optimal text translation of the text to be translated comprises:
for each sentence to be translated in the text to be translated, selecting an optimal text translation sentence from the text translation sentence set corresponding to that sentence according to the target weight parameters of the at least two translation models; the text translation sentence set consists of the at least two text translation sentences that correspond to the sentence to be translated in the text translations output by the at least two translation models;
and merging, in order, the optimal text translation sentences corresponding to all the sentences to be translated in the text to be translated into the optimal text translation.
9. The method for searching model parameters according to claim 8, wherein for each sentence to be translated in the text to be translated, selecting an optimal text translation sentence from the text translation sentence set corresponding to the sentence to be translated according to the target weight parameters of the at least two translation models, includes:
calculating the translation evaluation score of each text translation sentence in the text translation sentence set corresponding to the sentence to be translated, according to the target weight parameters of the at least two translation models and the translation probability of each text translation sentence output by the at least two translation models;
and selecting the text translation sentence with the highest translation evaluation score from the text translation sentence set corresponding to the sentence to be translated as the optimal text translation sentence of the sentence to be translated.
10. A model parameter search apparatus, comprising:
the corpus translation module is configured to obtain the respective translations output by the at least two translation models after translating the corpus texts in a corpus, and the translation probability of each translation sentence in the translations;
the weight parameter set searching module is configured to comprise a searching tree constructing sub-module and a searching sub-module, wherein the searching tree constructing sub-module is configured to construct a searching tree based on the translation and the translation probability of each translation in the translation, the weight parameter set in the parameter space corresponds to searching nodes in the searching tree one by one, and the searching tree constructing process comprises the following steps: starting from an initial search node, calculating heuristic costs of all search nodes of a next layer, sorting all search nodes of the next layer according to a sorting order of the heuristic costs from high to low, selecting a preset number of search nodes with the highest sorting order as the search nodes of the next layer, and so on, constructing a search tree based on the determined search nodes, wherein the heuristic costs of the search nodes are obtained according to model heuristic costs of each translation model in the at least two translation models, and the model heuristic costs of each translation model are the sum of products of weight parameters of the translation model and translation probability of each translation sentence in the translation; the searching submodule is configured to search weight parameter sets corresponding to the at least two translation models in the parameter space according to the search tree, wherein the number of weight parameters contained in the weight parameter sets is consistent with the number of translation models in the at least two translation models;
And the target weight parameter determining module is configured to respectively take the weight parameters contained in the searched target weight parameter set as the target weight parameters of the at least two translation models.
11. The model parameter searching apparatus according to claim 10, comprising:
the translation module is configured to input texts to be translated into the at least two translation models respectively to carry out text translation and output text translations aiming at the texts to be translated respectively;
and the translation fusion module is configured to fuse the text translations output by the at least two translation models respectively by adopting reordering to obtain the optimal text translation of the text to be translated.
12. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer executable instructions that when executed by the processor implement the steps of the model parameter searching method of any one of claims 1 to 9.
13. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the model parameter searching method of any one of claims 1 to 9.
CN201910227374.XA 2019-03-25 2019-03-25 Model parameter searching method and device Active CN109960814B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910227374.XA CN109960814B (en) 2019-03-25 2019-03-25 Model parameter searching method and device

Publications (2)

Publication Number Publication Date
CN109960814A CN109960814A (en) 2019-07-02
CN109960814B true CN109960814B (en) 2023-09-29

Family

ID=67024924



Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101420331A (en) * 2008-12-12 2009-04-29 北京邮电大学 Fast fault locating method for ultra-long connection in T-MPLS network
JP2009294747A (en) * 2008-06-03 2009-12-17 National Institute Of Information & Communication Technology Statistical machine translation device
JP2016058003A (en) * 2014-09-12 2016-04-21 日本放送協会 Translation device
CN105791117A (en) * 2016-03-21 2016-07-20 广东科学技术职业学院 QoSR fast solving method based on ant colony algorithm
CN106484681A (en) * 2015-08-25 2017-03-08 阿里巴巴集团控股有限公司 A kind of method generating candidate's translation, device and electronic equipment
CN108054968A (en) * 2017-11-17 2018-05-18 江西理工大学 A kind of open-loop control method of new-energy automobile

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7917488B2 (en) * 2008-03-03 2011-03-29 Microsoft Corporation Cross-lingual search re-ranking
JP5843117B2 (en) * 2013-12-04 2016-01-13 国立研究開発法人情報通信研究機構 Learning device, translation device, learning method, translation method, and program
US9836457B2 (en) * 2015-05-25 2017-12-05 Panasonic Intellectual Property Corporation Of America Machine translation method for performing translation between languages


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Liang Huacan. Research on Several Key Problems in Training Phrase-Based Statistical Machine Translation Models. China Doctoral Dissertations Full-text Database (Information Science and Technology), 2014, pp. I138-75. *
Duan Nan et al. A Comparative Analysis of Consensus Decoding Methods in Statistical Machine Translation. Journal of Chinese Information Processing, 2013, No. 01. *
Pan Yirong; Li Xiao; Yang Yating; Dong Rui. A Bilingual Relevance Optimization Model for Chinese-Uyghur Machine Translation. Application Research of Computers, 2018, No. 03, pp. 726-730. *
Wu Kaijun; Lu Huaiwei; Du Sanshan. Application of the Binary Particle Swarm Optimization Algorithm to the Vehicle Routing Problem. Journal of Northwest University for Nationalities (Natural Science Edition), 2010, No. 04, pp. 12-15. *
Chen Jie; Liao Wei. Design of an Improved Heuristic Ant Colony Algorithm for the Multi-Vehicle Loading Problem. Computer and Digital Engineering, 2011, pp. 17-19+50. *

Also Published As

Publication number Publication date
CN109960814A (en) 2019-07-02


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant