CN111104807A - Data processing method and device and electronic equipment

Data processing method and device and electronic equipment

Info

Publication number
CN111104807A
Authority
CN
China
Prior art keywords
machine translation
translation model
training
training data
text
Prior art date
Legal status
Pending
Application number
CN201911244108.4A
Other languages
Chinese (zh)
Inventor
施亮亮
陈伟
张旭
卫林钰
龚力
阳家俊
冷永才
Current Assignee
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201911244108.4A
Publication of CN111104807A
Legal status: Pending


Abstract

The embodiment of the invention provides a data processing method, a data processing apparatus and an electronic device, wherein the method comprises the following steps: acquiring training data of a specified field; and performing fine-tuning training on a first general machine translation model with the training data according to preset optimization information, wherein the preset optimization information comprises a fitting term of the training data and an offset adjustment term. The translation effect of the machine translation model in the specified field can thereby be improved while its translation effect in the general field is ensured.

Description

Data processing method and device and electronic equipment
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, and an electronic device.
Background
Artificial intelligence covers a very broad range of sciences, consisting of different fields such as machine learning and computer vision. In general, one of the main goals of artificial intelligence research is to enable machines to perform complex tasks that would typically require human intelligence. Since the birth of artificial intelligence, its theories and technologies have matured day by day, and its fields of application have expanded continuously, for example into machine translation: translating Chinese into English, translating English into Chinese, and the like.
When translating one natural language (the source language) into another natural language (the target language), the same source language word may need to be translated into different target language words in different fields (e.g., everyday spoken language, IT technology, biomedicine, etc.). For example, when translating English into Chinese, the English word "season" is translated into the Chinese "季节" (the time of year) in the daily-life field and into the Chinese "赛季" (a competition season) in the sports field. Therefore, a general machine translation model trained with general-field data cannot achieve a good translation effect in every different field.
To improve the effect of a general machine translation model in a specific field, it is often necessary to perform fine-tuning training on the general machine translation model using data of that field. Although this improves the translation effect of the machine translation model in the specific field to some extent, it greatly reduces its translation effect in the general field.
Disclosure of Invention
The embodiment of the invention provides a data processing method, which is used for improving the translation effect of a machine translation model in a specified field and ensuring the translation effect of the machine translation model in a general field.
Correspondingly, the embodiment of the invention also provides a data processing device and electronic equipment, which are used for ensuring the realization and application of the method.
In order to solve the above problem, an embodiment of the present invention discloses a data processing method, which specifically includes: acquiring training data of a specified field; performing fine tuning training on the first universal machine translation model by adopting the training data according to preset optimization information; wherein the preset optimization information comprises a fitting term and an offset adjustment term of the training data.
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps: inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information and preset optimization information.
Optionally, the adjusting parameters of the first general machine translation model according to the target language reference translation text, the first prediction probability information, and preset optimization information includes: determining a first optimization value corresponding to a fitting item of the training data according to the target language reference translation text and first prediction probability information; determining a second optimized value corresponding to the offset adjustment item according to the first prediction probability information; and adjusting parameters of the first general machine translation model by taking the minimization of the sum of the first optimization value and the second optimization value as a target.
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps: inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; inputting the source language training text into a second general machine translation model, and outputting second prediction probability information of a translation word list corresponding to the second general machine translation model; and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information and preset optimization information.
Optionally, the adjusting parameters of the first general machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information, and preset optimization information includes: determining a first optimization value corresponding to a fitting item of the training data according to the target language reference translation text and first prediction probability information; determining a third optimized value corresponding to the offset adjustment item according to the second prediction probability information; and adjusting the parameters of the first universal machine translation model by taking the minimization of the sum of the first optimized value and the third optimized value as a target.
Optionally, the fitting term of the training data is a probability distribution function of the target language reference translation text; and the offset adjustment term is a probability distribution function of the translation word list corresponding to the first universal machine translation model.
Optionally, the method further includes the step of determining the preset optimization information: acquiring a fitting term of the training data, an offset adjustment term and a hyper-parameter; multiplying the offset adjustment term by the hyper-parameter to obtain a corresponding product term; adding the fitting term of the training data and the product term to obtain a sum term; and determining the preset optimization information according to the sum term.
The embodiment of the invention also discloses a data processing device, which specifically comprises: the data acquisition module is used for acquiring training data of a specified field; the training module is used for carrying out fine tuning training on the first universal machine translation model by adopting the training data according to preset optimization information; wherein the preset optimization information comprises a fitting term and an offset adjustment term of the training data.
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the training module comprises: the first forward training submodule is used for inputting the source language training text into the first universal machine translation model and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; and the first backward training submodule is used for adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information and preset optimization information.
Optionally, the first backward training sub-module is configured to determine, according to the target language reference translation text and the first prediction probability information, a first optimized value corresponding to a fitting item of the training data; determining a second optimized value corresponding to the offset adjustment item according to the first prediction probability information; and adjusting parameters of the first general machine translation model by taking the minimization of the sum of the first optimization value and the second optimization value as a target.
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the training module comprises: the second forward training submodule is used for inputting the source language training text into the first universal machine translation model and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; the probability prediction submodule is used for inputting the source language training text into a second general machine translation model and outputting second prediction probability information of a translation word list corresponding to the second general machine translation model; and the second backward training submodule is used for adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information and preset optimization information.
Optionally, the second backward training sub-module is configured to determine, according to the target language reference translation text and the first prediction probability information, a first optimized value corresponding to a fitting item of the training data; determine a third optimized value corresponding to the offset adjustment item according to the second prediction probability information; and adjust the parameters of the first universal machine translation model by taking the minimization of the sum of the first optimized value and the third optimized value as a target.
Optionally, the fitting term of the training data is a probability distribution function of the target language reference translation text; and the offset adjustment term is a probability distribution function of the translation word list corresponding to the first universal machine translation model.
Optionally, the apparatus further comprises: an information determining module, configured to acquire a fitting term of the training data, an offset adjustment term and a hyper-parameter; multiply the offset adjustment term by the hyper-parameter to obtain a corresponding product term; add the fitting term of the training data and the product term to obtain a sum term; and determine the preset optimization information according to the sum term.
The embodiment of the invention also discloses a readable storage medium; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the data processing method according to any one of the embodiments of the invention.
An embodiment of the present invention also discloses an electronic device, including a memory, and one or more programs, where the one or more programs are stored in the memory, and configured to be executed by one or more processors, and the one or more programs include instructions for: acquiring training data of a specified field; performing fine tuning training on the first universal machine translation model by adopting the training data according to preset optimization information; wherein the preset optimization information comprises a fitting term and an offset adjustment term of the training data.
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps: inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information and preset optimization information.
Optionally, the adjusting parameters of the first general machine translation model according to the target language reference translation text, the first prediction probability information, and preset optimization information includes: determining a first optimization value corresponding to a fitting item of the training data according to the target language reference translation text and first prediction probability information; determining a second optimized value corresponding to the offset adjustment item according to the first prediction probability information; and adjusting parameters of the first general machine translation model by taking the minimization of the sum of the first optimization value and the second optimization value as a target.
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps: inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; inputting the source language training text into a second general machine translation model, and outputting second prediction probability information of a translation word list corresponding to the second general machine translation model; and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information and preset optimization information.
Optionally, the adjusting parameters of the first general machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information, and preset optimization information includes: determining a first optimization value corresponding to a fitting item of the training data according to the target language reference translation text and first prediction probability information; determining a third optimized value corresponding to the offset adjustment item according to the second prediction probability information; and adjusting the parameters of the first universal machine translation model by taking the minimization of the sum of the first optimized value and the third optimized value as a target.
Optionally, the fitting term of the training data is a probability distribution function of the target language reference translation text; and the offset adjustment term is a probability distribution function of the translation word list corresponding to the first universal machine translation model.
Optionally, the method further includes determining the preset optimization information by: acquiring a fitting term of the training data, an offset adjustment term and a hyper-parameter; multiplying the offset adjustment term by the hyper-parameter to obtain a corresponding product term; adding the fitting term of the training data and the product term to obtain a sum term; and determining the preset optimization information according to the sum term.
The embodiment of the invention has the following advantages:
in the embodiment of the invention, training data in a specified field are obtained, and then the training data are adopted to carry out fine tuning training on a first universal machine translation model according to preset optimization information comprising a fitting item and an offset adjustment item of the training data; and further, the translation effect of the machine translation model in the specified field is improved, and meanwhile, the translation effect of the machine translation model in the general field is guaranteed.
Drawings
FIG. 1 is a flow chart of the steps of one data processing method embodiment of the present invention;
FIG. 2 is a flow chart of the steps of an alternative embodiment of a data processing method of the present invention;
FIG. 3 is a flowchart illustrating steps of one embodiment of a method for determining default optimization information according to the present invention;
FIG. 4 is a flow chart of the steps of yet another alternative embodiment of a data processing method of the present invention;
FIG. 5 is a block diagram of an embodiment of a data processing apparatus of the present invention;
FIG. 6 is a block diagram of an alternate embodiment of a data processing apparatus of the present invention;
FIG. 7 illustrates a block diagram of an electronic device for data processing in accordance with an exemplary embodiment;
fig. 8 is a schematic structural diagram of an electronic device for data processing according to another exemplary embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
One of the core ideas of the embodiment of the invention is to perform fine-tuning training on the general machine translation model according to optimization information containing a fitting term of the training data and an offset adjustment term, so that the translation effect of the machine translation model in the specified field is improved while the translation effect of the machine translation model in the general field is ensured.
The general machine translation model may be a machine translation model trained by using training data in a general field.
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a data processing method according to the present invention is shown, which may specifically include the following steps:
Step 102: acquiring training data of a specified field.
Step 104: performing fine-tuning training on the first general machine translation model with the training data according to preset optimization information; wherein the preset optimization information includes a fitting term of the training data and an offset adjustment term.
In the embodiment of the invention, when the translation effect of a general machine translation model in a certain field needs to be improved, training data of that field may be obtained, and fine-tuning training is then performed on the general machine translation model with the training data of that field. The field in which the translation effect of the general machine translation model needs to be improved may be referred to as the specified field; the specified field may be any field, such as the medical field, the biological field, the sports field, and the like, which is not limited in the embodiment of the present invention. For convenience of the following description, the general machine translation model that is fine-tuned with the training data of the specified field may be referred to as the first general machine translation model.
In the embodiment of the invention, the preset optimization information may be determined in advance according to a fitting term of the training data and an offset adjustment term; the specific manner of determining the preset optimization information is described below. The fitting term of the training data is used for fitting the general machine translation model to the training data of the specified field, and the offset adjustment term is used for constraining the offset between the general machine translation model after training on the specified field and the general machine translation model before training. Fine-tuning training is then performed on the first general machine translation model with the training data according to the preset optimization information, so that the translation effect of the fine-tuned general machine translation model in the specified field is improved while its translation effect in the general field is ensured.
In summary, in the embodiment of the present invention, training data in a specified field is obtained, and then, according to preset optimization information including a fitting item and an offset adjustment item of the training data, fine tuning training is performed on a first general machine translation model by using the training data; and further, the translation effect of the machine translation model in the specified field is improved, and meanwhile, the translation effect of the machine translation model in the general field is guaranteed.
In this embodiment of the present invention, the training data of the specified field may include: a source language training text and a corresponding target language reference translation text. One implementation of performing fine-tuning training on the first general machine translation model with the training data according to the preset optimization information is as follows: the source language training sample is input into the first general machine translation model, and fine-tuning training is then performed on the first general machine translation model according to the information output by the first general machine translation model and the preset optimization information. The details are as follows:
referring to fig. 2, a flowchart illustrating steps of an alternative embodiment of the data processing method of the present invention is shown, which may specifically include the following steps:
step 202, training data of the designated field is obtained.
In the embodiment of the invention, a plurality of source language training texts and corresponding target language reference translation texts may be collected from materials related to the specified field, such as bilingual articles, bilingual literary works, etc. A source language training text and its corresponding target language reference translation text are then taken as one group of training data, so that a plurality of groups of training data of the specified field are generated. The collected groups of training data are then used to perform fine-tuning training on the first general machine translation model, as sketched below.
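As an illustration only (not part of the original disclosure), the grouping of training data described above might be organized as follows; the sentence pairs are invented sports-field examples:

```python
# Each group of training data pairs one source language training text with
# its corresponding target language reference translation text.
# The pairs below are invented for illustration.
training_data = [
    ("After the season, A retired from the Lakers", "赛季结束后，A从湖人队退役"),
    ("The team won the championship this season",   "这支球队本赛季夺得了冠军"),
]

for source_text, reference_text in training_data:
    pass  # each group is fed to the fine-tuning procedure described below
```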
In the embodiment of the invention, the fine-tuning training of the first general machine translation model with the training data according to the preset optimization information may include forward training and backward training. For the forward training, refer to step 204; for the backward training, refer to steps 206-210.
Step 204, inputting the source language training text into the first general machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first general machine translation model.
In this embodiment of the present invention, the process of performing forward training on the first general machine translation model may be: inputting the source language training text in each set of training data into the first universal machine translation model respectively; and translating the source language training text by the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model. The first prediction probability information comprises probability information of each word in a translation word list corresponding to the first universal machine translation model. In the embodiment of the present invention, when the target language reference translation text includes a plurality of words, after the source language training text is input to the first general machine translation model, the first general machine translation model outputs corresponding first prediction probability information at a position corresponding to each word in the target language reference translation text.
The first general machine translation model may then be trained in the backward direction. In this embodiment of the present invention, the process of backward training of the first general machine translation model may be: adjusting the parameters of the first general machine translation model according to the target language reference translation text, the first prediction probability information and the preset optimization information. That is, for each group of training data, the parameters of the first general machine translation model are adjusted according to the target language reference translation text of that group, the first prediction probability information output when the source language training text of that group is input, and the preset optimization information. Refer to steps 206-210.
Before describing how the parameters of the first general machine translation model are adjusted according to the target language reference translation text, the first prediction probability information and the preset optimization information, how to determine the preset optimization information is described first.
Referring to fig. 3, a flowchart illustrating steps of an embodiment of a method for determining preset optimization information according to the present invention is shown.
Step 302: acquiring a fitting term of the training data, an offset adjustment term and a hyper-parameter.
Step 304: multiplying the offset adjustment term by the hyper-parameter to obtain a corresponding product term.
Step 306: adding the fitting term of the training data and the product term to obtain a sum term.
Step 308: determining the preset optimization information according to the sum term.
In the embodiment of the invention, the fitting term of the training data, the offset adjustment term and the hyper-parameter λ may be obtained, and the preset optimization information is then generated from the obtained fitting term, offset adjustment term and hyper-parameter λ. λ may be set according to the actual situation and may be an empirical value, which is not limited in the embodiment of the present invention; by adjusting λ, the translation effect of the trained first general machine translation model in the specified field and in the general field can be adjusted.
In an alternative embodiment of the present invention, a probability distribution function of the target language reference translated text may be determined, and the probability distribution function may be used as a fitting term of the training data. The probability distribution function of the target language reference translation text may be a maximum likelihood function, and one expression manner of the fitting term of the training data may be as follows:
J1 = -Σ_{i=1}^{N} 1[y_i = v] · log p(y_i | x; θ)

where J1 is the fitting term of the training data; when the target language reference translation text includes a plurality of words, one J1 may be calculated for each word in the target language reference translation text. x is the input source language training text; V is the translation word list corresponding to the first general machine translation model, comprising N words, where N is a positive integer; y_i is the i-th word in the translation word list V, with i ranging from 1 to N; v ∈ V is the word at the current position of the target language reference translation text; θ denotes the parameters of the first general machine translation model when the source language training text is input; p denotes probability; and 1[y_i = v] equals 1 when y_i is the reference word v and 0 otherwise.
In one example of the present invention, a probability distribution function of the translation vocabulary corresponding to the first general machine translation model may be determined, and the probability distribution function may be used as an offset adjustment item. The probability distribution function corresponding to the translation word list corresponding to the first general machine translation model may be a cross entropy function, and one expression mode of the offset adjustment item may be as follows:
J2 = -Σ_{i=1}^{N} p(y_i | x; θ) · log p(y_i | x; θ)

where J2 is the offset adjustment term.
The offset adjustment term is then multiplied by the hyper-parameter to obtain the corresponding product term; reference may be made to the following expression:

J3 = λ * J2

where J3 is the product term.
Then, the fitting term of the training data and the product term may be added to obtain a sum term, and the preset optimization information is determined according to the sum term. In an example of the present invention, the sum term may be determined as the preset optimization information, and one expression of the preset optimization information may be as follows:

J = J1 + J3 = J1 + λ * J2

where J is the preset optimization information, and the value of J is a positive number.
Step 206: determining a first optimized value corresponding to the fitting term of the training data according to the target language reference translation text and the first prediction probability information.
Step 208: determining a second optimized value corresponding to the offset adjustment term according to the first prediction probability information.
Step 210: adjusting the parameters of the first general machine translation model with the goal of minimizing the sum of the first optimized value and the second optimized value.
In the embodiment of the present invention, when the target language reference translation text includes a plurality of words, an example of calculating the value of J1 and the value of J2 corresponding to the jth word in the target language reference translation text is described.
Specifically, the first prediction probability information output by the first general machine translation model at the jth position may be obtained, and the probability corresponding to each word in the translation word list is determined from this first prediction probability information; the log value of the probability corresponding to each word in the translation word list is then calculated. The log value of the probability corresponding to the jth word of the target language reference translation text is multiplied by 1 to obtain one product, and the log values of the probabilities of the other (N-1) words in the translation word list are multiplied by 0 to obtain the corresponding N-1 products; the sum of these N products is then calculated and negated to obtain the first optimized value (i.e., the value of J1 for the jth word of the target language reference translation text). The first optimized values corresponding to the words of the target language reference translation text may then be added to obtain the first optimized value corresponding to the target language reference translation text.
In the embodiment of the invention, the log value of the probability corresponding to each word in the translation word list may be calculated, and the product of the probability of each word in the translation word list and the log of that probability may be computed, giving N products. The sum of these N products is then calculated and negated to obtain the second optimized value (i.e., the value of J2 for the jth word of the target language reference translation text). The second optimized values corresponding to the words of the target language reference translation text may then be added to obtain the second optimized value corresponding to the target language reference translation text.
Iteration may then be performed with the goal of minimizing the sum of the first and second optimized values, adjusting the parameter θ of the first generic machine translation model.
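A minimal PyTorch-style sketch of one such fine-tuning step follows (illustrative only: the patent prescribes no framework, and `model` is a hypothetical first general machine translation model returning per-position logits over the translation word list under teacher forcing):

```python
import torch
import torch.nn.functional as F

def fine_tune_step(model, optimizer, src_batch, ref_batch, lam=0.5):
    """One step of the fine-tuning described above: minimize J1 + λ * J2.

    src_batch: tokenized source language training texts
    ref_batch: indices of the target language reference translation words,
               shape (batch, target_len)
    """
    logits = model(src_batch, ref_batch)       # (batch, target_len, vocab)
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()                    # first prediction probability info

    # First optimized value (J1): negative log-likelihood of each reference
    # word, summed over all target positions.
    J1 = F.nll_loss(log_probs.transpose(1, 2), ref_batch, reduction="sum")

    # Second optimized value (J2): -sum of p * log p over the translation
    # word list at every position, using the first model's own predictions.
    J2 = -(probs * log_probs).sum()

    loss = J1 + lam * J2                       # the preset optimization information J
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```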
In summary, in the embodiment of the present invention, training data in a specified field is obtained, then the source language training text is input into the first general machine translation model, and first prediction probability information of a translation vocabulary corresponding to the first general machine translation model is output; determining a first optimized value corresponding to a fitting item of the training data according to the target language reference translation text and first prediction probability information, determining a second optimized value corresponding to the offset adjusting item according to the first prediction probability information, and adjusting parameters of the first universal machine translation model by taking the sum of the minimized first optimized value and the minimized second optimized value as a target; and further, the translation effect of the machine translation model in the specified field can be further improved, and the translation effect of the machine translation model in the general field can be guaranteed.
In another embodiment of the present invention, another implementation of performing fine-tuning training on the first general machine translation model with the training data according to the preset optimization information is as follows: the source language training samples are respectively input into the first general machine translation model and a second general machine translation model, and fine-tuning training is then performed on the first general machine translation model according to the information output by the first general machine translation model, the information output by the second general machine translation model, and the preset optimization information. The details are as follows:
the second general machine translation model may also be a machine translation model trained using training data in a general field, and may be the same model as the first general machine translation model not trained using training data in a specific field.
Referring to fig. 4, a flowchart illustrating steps of another alternative embodiment of the data processing method of the present invention is shown, which may specifically include the following steps:
step 402, training data of a specified field is obtained.
This step 402 is similar to the step 202 described above and will not be described herein again.
In the embodiment of the present invention, for the fine-tuning training of the first general machine translation model with the training data according to the preset optimization information, refer to steps 404 and 406 for the forward training and steps 408 to 412 for the backward training.
Step 404, inputting the source language training text into the first general machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first general machine translation model.
This step 404 is similar to the step 204, and will not be described herein again.
Step 406: inputting the source language training text into a second general machine translation model, and outputting second prediction probability information of a translation word list corresponding to the second general machine translation model.
In the embodiment of the invention, the information output by the second general machine translation model for the source language training text may be used to calculate the optimized value of the offset adjustment term in the preset optimization information. Therefore, the source language training text in each set of training data may be input into the second general machine translation model; the second general machine translation model translates the source language training text and outputs the second prediction probability information of the translation word list corresponding to the second general machine translation model. The translation word list of the first general machine translation model and the translation word list of the second general machine translation model may be the same.
Then, the parameters of the first general machine translation model may be adjusted according to the target language reference translation text, the first prediction probability information, the second prediction probability information and the preset optimization information. That is, for each group of training data, the parameters of the first general machine translation model are adjusted according to the first prediction probability information output by the first general machine translation model for the source language training text of that group, the second prediction probability information output by the second general machine translation model for the same source language training text, the corresponding target language reference translation text, and the preset optimization information. Reference may be made to steps 408-412:
and step 408, determining a first optimized value corresponding to the fitting item of the training data according to the target language reference translation text and the first prediction probability information.
And step 410, determining a third optimization value of the offset adjustment item according to the second prediction probability information.
Step 412, adjusting the first generic machine translation model parameter with the goal of minimizing the sum of the first optimized value and the third optimized value.
The steps 408-412 are similar to the steps 206-210 described above and will not be described herein again.
In step 410, the θ in the probability weighting factor p(y_i | x; θ) of the offset adjustment term is the parameter of the second general machine translation model; during the fine-tuning training of the first general machine translation model, this parameter θ of the second general machine translation model is kept unchanged.
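Under the same hypothetical interface as the sketch above, one plausible reading of steps 408-412 replaces the self-entropy term with a cross entropy that weights the first model's log probabilities by the frozen second general machine translation model's predictions:

```python
import torch
import torch.nn.functional as F

def fine_tune_step_v2(model, general_model, optimizer, src_batch, ref_batch, lam=0.5):
    """One fine-tuning step using a frozen second general machine translation
    model to compute the offset adjustment term (illustrative sketch only)."""
    logits = model(src_batch, ref_batch)            # first model, being trained
    log_probs = F.log_softmax(logits, dim=-1)

    with torch.no_grad():                           # second model: θ kept unchanged
        general_logits = general_model(src_batch, ref_batch)
        general_probs = F.softmax(general_logits, dim=-1)  # second prediction info

    # First optimized value: fit to the in-domain reference translations.
    J1 = F.nll_loss(log_probs.transpose(1, 2), ref_batch, reduction="sum")

    # Third optimized value: cross entropy between the frozen general model's
    # distribution and the first model's distribution over the word list.
    J2 = -(general_probs * log_probs).sum()

    loss = J1 + lam * J2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```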
In summary, in the embodiment of the present invention, training data in a specified field is obtained, then the source language training text is input into the first general machine translation model, first prediction probability information of a translation vocabulary corresponding to the first general machine translation model is output, the source language training text is input into the second general machine translation model, and second prediction probability information of a translation vocabulary corresponding to the second general machine translation model is output; determining a first optimized value corresponding to a fitting item of the training data according to the target language reference translation text and first prediction probability information, determining a third optimized value corresponding to the offset adjusting item according to the second prediction probability information, and adjusting the parameters of the first universal machine translation model by taking the sum of the minimized first optimized value and the minimized third optimized value as a target; and further, the translation effect of the machine translation model in the specified field can be further improved, and the translation effect of the machine translation model in the general field can be guaranteed.
In addition, the translation effect of the first general machine translation model after the fine-tuning training according to steps 402-412 is better than that of the first general machine translation model after the fine-tuning training according to steps 202-210.
As an example of the present invention, suppose the specified field is the sports field: training data of the sports field is obtained, and fine-tuning training is performed on the first general machine translation model with the training data of the sports field according to the preset optimization information; the fine-tuned first general machine translation model can then be used for translation. For example, when translating English into Chinese, inputting the English text "After the season, A retired from the Lakers" into the fine-tuned first general machine translation model yields a Chinese output in which "season" is rendered as "赛季", the sports-field sense. For another example, inputting the English text "This season, suitable for soaking feet" into the fine-tuned first general machine translation model yields a Chinese output in which "season" is rendered as "季节", the general-field sense. It can be seen that the fine-tuned first general machine translation model can accurately translate "season" both into the Chinese expression of the sports field and into the Chinese expression of the general field.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 5, a block diagram of a data processing apparatus according to an embodiment of the present invention is shown, which may specifically include the following modules:
a data obtaining module 502, configured to obtain training data of a specified field;
the training module 504 is configured to perform fine tuning training on the first general machine translation model by using the training data according to preset optimization information; wherein the preset optimization information comprises a fitting term and an offset adjustment term of the training data.
Referring to fig. 6, a block diagram of an alternative embodiment of a data processing apparatus of the present invention is shown.
In an optional embodiment of the present invention, the training data includes: a source language training text and a corresponding target language reference translation text; the training module 504 includes:
the first forward training submodule 5042 is configured to input the source language training text into the first general machine translation model, and output first prediction probability information of a translation word list corresponding to the first general machine translation model;
and the first backward training submodule 5044 is configured to adjust parameters of the first general machine translation model according to the target language reference translation text, the first prediction probability information, and preset optimization information.
In an optional embodiment of the present invention, the first backward training sub-module 5044 is configured to determine, according to the target language reference translation text and the first prediction probability information, a first optimized value corresponding to a fitting term of the training data; determining a second optimized value corresponding to the offset adjustment item according to the first prediction probability information; and adjusting parameters of the first general machine translation model by taking the minimization of the sum of the first optimization value and the second optimization value as a target.
In an optional embodiment of the present invention, the training data includes: a source language training text and a corresponding target language reference translation text; the training module 504 includes:
the second forward training submodule 5046 is used for inputting the source language training text into the first general machine translation model and outputting first prediction probability information of a translation word list corresponding to the first general machine translation model;
the probability prediction submodule 5048 is used for inputting the source language training text into a second general machine translation model and outputting second prediction probability information of a translation word list corresponding to the second general machine translation model;
and the second backward training submodule 50410 is configured to adjust parameters of the first general machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information, and preset optimization information.
In an optional embodiment of the present invention, the second backward training sub-module 50410 is configured to determine, according to the target language reference translation text and the first prediction probability information, a first optimized value corresponding to a fitting item of the training data; determine a third optimized value corresponding to the offset adjustment item according to the second prediction probability information; and adjust the parameters of the first universal machine translation model by taking the minimization of the sum of the first optimized value and the third optimized value as a target.
In an optional embodiment of the present invention, the fitting term of the training data is a probability distribution function of the target language reference translation text; and the offset adjustment term is a probability distribution function of the translation word list corresponding to the first universal machine translation model.
In an optional embodiment of the present invention, the apparatus further comprises:
an information determination module 506, configured to obtain a fitting term and an offset adjustment term of the training data, and a hyper-parameter; multiplying the offset adjustment item and the hyperparameter to obtain a corresponding product fitting item; adding the fitting term of the training data and the product value fitting term to obtain a sum value fitting term; and determining the preset optimization information according to the sum fitting item.
In summary, in the embodiment of the present invention, training data in a specified field is obtained, and then, according to preset optimization information including a fitting item and an offset adjustment item of the training data, fine tuning training is performed on a first general machine translation model by using the training data; and further, the translation effect of the machine translation model in the specified field is improved, and meanwhile, the translation effect of the machine translation model in the general field is guaranteed.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
Fig. 7 is a block diagram illustrating an architecture of an electronic device 700 for data processing in accordance with an exemplary embodiment. For example, the electronic device 700 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 7, electronic device 700 may include one or more of the following components: a processing component 702, a memory 704, a power component 706, a multimedia component 708, an audio component 710, an input/output (I/O) interface 712, a sensor component 714, and a communication component 716.
The processing component 702 generally controls overall operation of the electronic device 700, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing element 702 may include one or more processors 720 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 702 may include one or more modules that facilitate interaction between the processing component 702 and other components. For example, the processing component 702 can include a multimedia module to facilitate interaction between the multimedia component 708 and the processing component 702.
The memory 704 is configured to store various types of data to support operation at the device 700. Examples of such data include instructions for any application or method operating on the electronic device 700, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 704 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 706 provides power to the various components of the electronic device 700. The power components 706 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device 700.
The multimedia component 708 includes a screen that provides an output interface between the electronic device 700 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 708 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the electronic device 700 is in an operation mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 710 is configured to output and/or input audio signals. For example, the audio component 710 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 700 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 704 or transmitted via the communication component 716. In some embodiments, audio component 710 also includes a speaker for outputting audio signals.
The I/O interface 712 provides an interface between the processing component 702 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 714 includes one or more sensors for providing various aspects of status assessment for the electronic device 700. For example, the sensor assembly 714 may detect an open/closed state of the device 700, the relative positioning of components, such as a display and keypad of the electronic device 700, the sensor assembly 714 may also detect a change in the position of the electronic device 700 or a component of the electronic device 700, the presence or absence of user contact with the electronic device 700, orientation or acceleration/deceleration of the electronic device 700, and a change in the temperature of the electronic device 700. The sensor assembly 714 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 714 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 714 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 716 is configured to facilitate wired or wireless communication between the electronic device 700 and other devices. The electronic device 700 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 716 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 716 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as the memory 704 comprising instructions, executable by the processor 720 of the electronic device 700 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer readable storage medium in which instructions, when executed by a processor of an electronic device, enable the electronic device to perform a data processing method, the method comprising: acquiring training data of a specified field; performing fine tuning training on the first universal machine translation model by adopting the training data according to preset optimization information; wherein the preset optimization information comprises a fitting term and an offset adjustment term of the training data.
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps: inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information and preset optimization information.
Optionally, the adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, and preset optimization information includes: determining a first optimization value corresponding to the fitting term of the training data according to the target language reference translation text and the first prediction probability information; determining a second optimization value corresponding to the offset adjustment term according to the first prediction probability information; and adjusting the parameters of the first universal machine translation model with the goal of minimizing the sum of the first optimization value and the second optimization value.
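To make this concrete, below is a minimal, illustrative PyTorch sketch of the single-model fine-tuning step. The embodiment does not prescribe a concrete form for the two optimization values; here the fitting term is instantiated as cross-entropy against the target language reference translation text, and the offset adjustment term as a negative-entropy penalty on the first prediction probability information. The model interface, vocabulary layout, and the entropy-based choice of offset term are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def fine_tune_step(model, optimizer, src_batch, ref_batch, lam=0.1):
    """One fine-tuning step on in-domain data (illustrative sketch).

    model     -- first universal machine translation model (hypothetical
                 interface): maps source token ids of shape (batch, src_len)
                 to logits of shape (batch, tgt_len, vocab_size)
    src_batch -- source language training text, as token ids
    ref_batch -- target language reference translation text, (batch, tgt_len)
    lam       -- hyper-parameter weighting the offset adjustment term
    """
    logits = model(src_batch)
    # First prediction probability information: a distribution over the
    # translation word list at every target position.
    log_probs = F.log_softmax(logits, dim=-1)

    # First optimization value: fitting term of the training data, i.e.
    # cross-entropy between predictions and the reference translation.
    fit = F.nll_loss(log_probs.flatten(0, 1), ref_batch.flatten())

    # Second optimization value: offset adjustment term, sketched here as a
    # negative-entropy (confidence) penalty that damps overfitting to the
    # specified field (an assumption; the patent fixes no particular form).
    probs = log_probs.exp()
    offset = (probs * log_probs).sum(dim=-1).mean()

    # Adjust parameters by minimizing the sum of the two optimization values.
    loss = fit + lam * offset
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```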
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps: inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; inputting the source language training text into a second general machine translation model, and outputting second prediction probability information of a translation word list corresponding to the second general machine translation model; and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information and preset optimization information.
Optionally, the adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information, and preset optimization information includes: determining a first optimization value corresponding to the fitting term of the training data according to the target language reference translation text and the first prediction probability information; determining a third optimization value corresponding to the offset adjustment term according to the second prediction probability information; and adjusting the parameters of the first universal machine translation model with the goal of minimizing the sum of the first optimization value and the third optimization value.
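A sketch of this two-model variant follows, under the assumption that the second general machine translation model is a frozen copy of the pre-fine-tuning model and that the offset adjustment term is the KL divergence pulling the fine-tuned distribution toward the frozen one; the KL choice and all identifiers are illustrative, not prescribed by the embodiment.

```python
import torch
import torch.nn.functional as F

def fine_tune_step_two_models(model, general_model, optimizer,
                              src_batch, ref_batch, lam=0.5):
    """Fine-tune `model` while anchoring it to a frozen `general_model`.

    model         -- first universal machine translation model (being tuned)
    general_model -- second general machine translation model (kept frozen)
    """
    # First prediction probability information from the model being tuned.
    log_p = F.log_softmax(model(src_batch), dim=-1)

    # Second prediction probability information; no gradients flow into the
    # frozen general-domain model.
    with torch.no_grad():
        log_q = F.log_softmax(general_model(src_batch), dim=-1)

    # First optimization value: fitting term against the reference translation.
    fit = F.nll_loss(log_p.flatten(0, 1), ref_batch.flatten())

    # Third optimization value: offset adjustment term, sketched as
    # KL(q || p), which keeps the tuned distribution close to the general one
    # so general-domain translation quality is preserved (an assumption).
    offset = F.kl_div(log_p, log_q, log_target=True, reduction='batchmean')

    # Adjust parameters by minimizing the sum of the two optimization values.
    loss = fit + lam * offset
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Keeping `lam` small trades in-domain fit against drift from the general model; `lam=0` recovers plain fine-tuning.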
Optionally, the fitting term of the training data is a probability distribution function of the target language reference translation text; and the offset adjustment term is a probability distribution function of the translation word list corresponding to the first universal machine translation model.
Optionally, the method further includes the step of determining the preset optimization information: acquiring the fitting term, the offset adjustment term, and a hyper-parameter of the training data; multiplying the offset adjustment term by the hyper-parameter to obtain a corresponding product value; adding the fitting term of the training data and the product value to obtain a sum value; and determining the preset optimization information according to the sum value.
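Read this way, the preset optimization information reduces to a weighted sum of the two terms. A minimal sketch, assuming the terms have already been evaluated to scalars, is:

```python
def preset_optimization_value(fit_term: float, offset_term: float,
                              hyper_param: float) -> float:
    """Combine the terms as described: offset adjustment term times the
    hyper-parameter, plus the fitting term of the training data."""
    product_value = hyper_param * offset_term  # offset term x hyper-parameter
    sum_value = fit_term + product_value       # fitting term + product value
    return sum_value                           # quantity minimized in training
```

In the sketches above this is exactly `loss = fit + lam * offset`, with `lam` playing the role of the hyper-parameter.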
Fig. 8 is a schematic structural diagram of an electronic device 800 for data processing according to another exemplary embodiment of the present invention. The electronic device 800 may be a server, which may vary widely in configuration and performance, and may include one or more Central Processing Units (CPUs) 822 (e.g., one or more processors), memory 832, and one or more storage media 830 (e.g., one or more mass storage devices) storing applications 842 or data 844. The memory 832 and the storage medium 830 may be transient or persistent storage. The program stored in the storage medium 830 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 822 may be configured to communicate with the storage medium 830 to execute the series of instruction operations in the storage medium 830 on the server.
The server may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input/output interfaces 858, one or more keyboards 856, and/or one or more operating systems 841, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
An electronic device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for: acquiring training data of a specified field; performing fine tuning training on the first universal machine translation model by adopting the training data according to preset optimization information; wherein the preset optimization information comprises a fitting term and an offset adjustment term of the training data.
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps: inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information and preset optimization information.
Optionally, the adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, and preset optimization information includes: determining a first optimization value corresponding to the fitting term of the training data according to the target language reference translation text and the first prediction probability information; determining a second optimization value corresponding to the offset adjustment term according to the first prediction probability information; and adjusting the parameters of the first universal machine translation model with the goal of minimizing the sum of the first optimization value and the second optimization value.
Optionally, the training data comprises: a source language training text and a corresponding target language reference translation text; the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps: inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model; inputting the source language training text into a second general machine translation model, and outputting second prediction probability information of a translation word list corresponding to the second general machine translation model; and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information and preset optimization information.
Optionally, the adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information, and preset optimization information includes: determining a first optimization value corresponding to the fitting term of the training data according to the target language reference translation text and the first prediction probability information; determining a third optimization value corresponding to the offset adjustment term according to the second prediction probability information; and adjusting the parameters of the first universal machine translation model with the goal of minimizing the sum of the first optimization value and the third optimization value.
Optionally, the fitting term of the training data is a probability distribution function of the target language reference translation text; and the offset adjustment term is a probability distribution function of the translation word list corresponding to the first universal machine translation model.
Optionally, the method further includes determining the preset optimization information by: acquiring the fitting term, the offset adjustment term, and a hyper-parameter of the training data; multiplying the offset adjustment term by the hyper-parameter to obtain a corresponding product value; adding the fitting term of the training data and the product value to obtain a sum value; and determining the preset optimization information according to the sum value.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or terminal that comprises the element.
The data processing method, the data processing apparatus, and the electronic device provided by the present invention are described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method and its core ideas. Meanwhile, for those skilled in the art, there may be variations in the specific implementation and application scope according to the idea of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (10)

1. A data processing method, comprising:
acquiring training data of a specified field;
performing fine tuning training on the first universal machine translation model by adopting the training data according to preset optimization information;
wherein the preset optimization information comprises a fitting term and an offset adjustment term of the training data.
2. The method of claim 1, wherein the training data comprises: a source language training text and a corresponding target language reference translation text;
the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps:
inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model;
and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information and preset optimization information.
3. The method of claim 2, wherein adjusting parameters of the first universal machine translation model based on the target language reference translation text, the first prediction probability information, and the preset optimization information comprises:
determining a first optimization value corresponding to the fitting term of the training data according to the target language reference translation text and the first prediction probability information;
determining a second optimization value corresponding to the offset adjustment term according to the first prediction probability information;
and adjusting the parameters of the first universal machine translation model with the goal of minimizing the sum of the first optimization value and the second optimization value.
4. The method of claim 1, wherein the training data comprises: a source language training text and a corresponding target language reference translation text;
the method for performing fine tuning training on the first universal machine translation model by adopting the training data according to the preset optimization information comprises the following steps:
inputting the source language training text into the first universal machine translation model, and outputting first prediction probability information of a translation word list corresponding to the first universal machine translation model;
inputting the source language training text into a second general machine translation model, and outputting second prediction probability information of a translation word list corresponding to the second general machine translation model;
and adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information and preset optimization information.
5. The method of claim 4, wherein adjusting parameters of the first universal machine translation model according to the target language reference translation text, the first prediction probability information, the second prediction probability information, and preset optimization information comprises:
determining a first optimization value corresponding to the fitting term of the training data according to the target language reference translation text and the first prediction probability information;
determining a third optimization value corresponding to the offset adjustment term according to the second prediction probability information;
and adjusting the parameters of the first universal machine translation model with the goal of minimizing the sum of the first optimization value and the third optimization value.
6. The method according to any one of claims 2 to 5,
the fitting term of the training data is a probability distribution function of the target language reference translation text;
and the offset adjustment term is a probability distribution function of the translation word list corresponding to the first universal machine translation model.
7. The method of claim 1, further comprising the step of determining the preset optimization information:
acquiring the fitting term, the offset adjustment term, and a hyper-parameter of the training data;
multiplying the offset adjustment term by the hyper-parameter to obtain a corresponding product value;
adding the fitting term of the training data and the product value to obtain a sum value;
and determining the preset optimization information according to the sum value.
8. A data processing apparatus, comprising:
the data acquisition module is used for acquiring training data of a specified field;
the training module is used for carrying out fine tuning training on the first universal machine translation model by adopting the training data according to preset optimization information; wherein the preset optimization information comprises a fitting term and an offset adjustment term of the training data.
9. A readable storage medium, characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data processing method according to any one of claims 1 to 7.
10. An electronic device comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for:
acquiring training data of a specified field;
performing fine tuning training on the first universal machine translation model by adopting the training data according to preset optimization information;
wherein the preset optimization information comprises a fitting term and an offset adjustment term of the training data.
CN201911244108.4A 2019-12-06 2019-12-06 Data processing method and device and electronic equipment Pending CN111104807A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911244108.4A CN111104807A (en) 2019-12-06 2019-12-06 Data processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911244108.4A CN111104807A (en) 2019-12-06 2019-12-06 Data processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN111104807A (en) 2020-05-05

Family

ID=70422378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911244108.4A Pending CN111104807A (en) 2019-12-06 2019-12-06 Data processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111104807A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2408819A1 (en) * 2000-05-11 2001-11-15 University Of Southern California Machine translation techniques
JP2017021422A (en) * 2015-07-07 2017-01-26 国立研究開発法人情報通信研究機構 Statistical translation optimization device, statistical translation system, and computer program
CN108932231A (en) * 2017-05-26 2018-12-04 华为技术有限公司 Machine translation method and device
US20200089774A1 (en) * 2017-05-26 2020-03-19 Huawei Technologies Co., Ltd. Machine Translation Method and Apparatus, and Storage Medium
CN110472252A (en) * 2019-08-15 2019-11-19 昆明理工大学 The method of the more neural machine translation of the Chinese based on transfer learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380883A (en) * 2020-12-04 2021-02-19 北京有竹居网络技术有限公司 Model training method, machine translation method, device, equipment and storage medium
CN112380883B (en) * 2020-12-04 2023-07-25 北京有竹居网络技术有限公司 Model training method, machine translation method, device, equipment and storage medium
CN113591437A (en) * 2021-08-09 2021-11-02 网易(杭州)网络有限公司 Game text translation method, electronic device and storage medium
CN113591437B (en) * 2021-08-09 2023-08-08 网易(杭州)网络有限公司 Game text translation method, electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN110826344B (en) Neural network model compression method, corpus translation method and apparatus thereof
CN107291690B (en) Punctuation adding method and device and punctuation adding device
CN107221330B (en) Punctuation adding method and device and punctuation adding device
CN109243430B (en) Voice recognition method and device
JP2017535007A (en) Classifier training method, type recognition method and apparatus
KR102475588B1 (en) Method and device for compressing a neural network model for machine translation and storage medium
CN109961094B (en) Sample acquisition method and device, electronic equipment and readable storage medium
CN107133354B (en) Method and device for acquiring image description information
CN111160448B (en) Training method and device for image classification model
CN108628813B (en) Processing method and device for processing
CN108073303B (en) Input method and device and electronic equipment
CN112148980B (en) Article recommending method, device, equipment and storage medium based on user click
CN110874145A (en) Input method and device and electronic equipment
CN111104807A (en) Data processing method and device and electronic equipment
CN111753917A (en) Data processing method, device and storage medium
US20210089726A1 (en) Data processing method, device and apparatus for data processing
CN110674246A (en) Question-answering model training method, automatic question-answering method and device
CN108733657B (en) Attention parameter correction method and device in neural machine translation and electronic equipment
CN112768064A (en) Disease prediction device, disease prediction apparatus, symptom information processing method, symptom information processing device, and symptom information processing apparatus
CN109918565B (en) Processing method and device for search data and electronic equipment
CN112308588A (en) Advertisement putting method and device and storage medium
CN107301188B (en) Method for acquiring user interest and electronic equipment
CN108345590B (en) Translation method, translation device, electronic equipment and storage medium
CN107291259B (en) Information display method and device for information display
CN113923517A (en) Background music generation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination