CN115688904B - Translation model construction method based on noun translation prompt

Info

Publication number: CN115688904B
Authority: CN (China)
Prior art keywords: noun, translation, model, data, list
Legal status: Active
Application number: CN202211348033.6A
Other languages: Chinese (zh)
Other versions: CN115688904A
Inventors: 迟雨桐, 冯少辉, 李鹏
Assignee: Beijing Iplus Teck Co ltd (original assignee)
Application filed by Beijing Iplus Teck Co ltd; published as CN115688904A, granted as CN115688904B.

Classifications

    • Y: General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02: Technologies or applications for mitigation or adaptation against climate change
    • Y02D: Climate change mitigation technologies in information and communication technologies [ICT], i.e. information and communication technologies aiming at the reduction of their own energy use
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The invention relates to a translation model construction method based on noun translation prompts, belongs to the technical field of natural language processing, and solves the problems of inaccurate noun and proper-noun translation, missed translation and mistranslation in prior-art machine translation models. Noun recognition is performed on parallel corpora of the two languages to be translated to obtain an original-text noun set and a translated-text noun set, from which the training samples of the translation model to be trained and the corresponding adjustment matrices are constructed. The adjustment matrix is used to adjust the attention calculation of the model, and the translation model is trained with the text augmented with noun translation prompts as input and the reference translation as target output, yielding the finally trained translation model NPTrans. Based on input data containing noun translation prompts and on the adjustment of the adjustment matrix, the noun-translation accuracy of the model is ensured to a certain extent, the problems of missed and mistaken noun translation are alleviated, and the noun-translation accuracy of the machine translation model is improved.

Description

Translation model construction method based on noun translation prompt
Technical Field
The invention relates to the technical field of natural language processing, in particular to a translation model construction method based on noun translation prompt.
Background
Machine translation, also called automatic translation, is one of the important directions of artificial intelligence: the process of converting one natural language (the source language) into another natural language (the target language) using a computer. With economic globalization and the rapid development of the Internet, machine translation technology plays an increasingly important role in promoting political, economic and cultural exchanges among countries, so research on machine translation technology has important practical significance.
Early machine translation technology was based on statistical machine translation (SMT), which treats translation as a probability problem and performs disambiguation and translation selection directly from statistical results, thereby avoiding the hard problem of language understanding. However, because corpus selection and processing require an enormous amount of engineering, few general-domain machine translation systems are based on statistical methods. In recent years, neural machine translation (NMT) based on deep learning networks has been widely adopted; its multi-layer network structure learns the context of the source text well, extracts semantic features, and generates more fluent and well-formed translations, improving machine translation quality.
However, deep learning-based methods have drawbacks, the most important of which is the incorrect translation of nouns and proper nouns. Inaccurate noun translation includes both missed translation (directly skipping a noun or a noun segment without translating it) and mistranslation (translating it incorrectly); the missed-translation problem is particularly acute when translation data and training examples for a language pair are scarce. Because existing machine translation models suffer from inaccurate, missed and mistaken translations of nouns and proper nouns, a machine translation model that ensures the accuracy of noun translation is badly needed.
Disclosure of Invention
In view of the above analysis, the present invention aims to provide a translation model construction method based on noun translation prompts, used to solve the problems of inaccurate, missed and mistaken translations of nouns and proper nouns in existing machine translation models.
In one aspect, an embodiment of the invention provides a translation model construction method based on noun translation prompts, which comprises the following steps:
acquiring parallel corpus data of two languages to be translated to obtain a data set D;
identifying nouns in the original text and the translated text of each piece of data in the data set D to obtain an original-text noun set S_word and a translated-text noun set S_word-trans of each piece of data;
obtaining training samples X_input of all data in D and the corresponding adjustment matrices M_train through data construction, wherein X_input = [x_1, x_2, …, x_g], M_train = [M_1, M_2, …, M_g], a single training sample x_i consists of the text x_input augmented with noun translation prompts and the target translation x_gold, i ∈ [1, 2, …, g], and g is the number of data items;
importing the adjustment matrices M_train into the translation model to be trained, and training the model with the training samples X_input to obtain the finally trained translation model NPTrans.
Further, the data construction comprises the following steps:
carrying out data cleaning on the parallel corpus data to obtain a cleaned data set D;
splicing, after the original text of each piece of data in the data set D, the translations of all nouns in the noun translation set S_word-trans in order, to obtain the input texts X_input of the translation model;
constructing, for each text x_input augmented with noun translation prompts, a list List_index of corresponding position relationships; determining the values M_ij of the elements of the adjustment matrix according to List_index, and inserting special symbols in the start and end rows to obtain the adjustment matrix M_g corresponding to the single training sample x_i, thereby obtaining the adjustment matrices M_train corresponding to all the data.
Further, constructing the list List_index of corresponding position relationships for each text x_input augmented with noun translation prompts comprises the following steps:
representing, for each input text x_input of the translation model, the position of each pair of noun and noun translation in x_input by a pair of tuples;
each noun-translation position tuple pair forming a sub-list;
connecting all the noun-translation position tuple pair sub-lists to form the list List_index of position relationships of the text.
Further, the values and constraints of the adjustment matrix elements M_ij are as follows:
M_ij = 0, if (i ≤ len(x_i0) + 1 or i = L) and (j ≤ len(x_i0) + 1 or j = L);
M_ij = 0, if i falls within the tuple List_index[z][0] and j within the tuple List_index[z][1], or vice versa, for some z ∈ [1, len(List_index)];
M_ij = -∞, otherwise;
wherein len(x_i0) represents the length of the original text x_i0 of a single training sample after cleaning and before adding the translation prompts, L = len(x_input) + 2, len(List_index) represents the length of List_index (i.e. the number of sub-lists), List_index[z][0] represents the first tuple in the z-th sub-list of List_index, and List_index[z][1] represents the second tuple in the z-th sub-list.
Further, the attention of the model after importing the adjustment matrices M_train is calculated using the following function:
Attention(Q_i, K_i, V_i) = softmax((Q_i · K_i^T) / √d_ki + M_i) · V_i
wherein Q_i, K_i, V_i are the Query, Key and Value matrices when calculating the attention of x_i, and d_ki is the dimension of Q_i or K_i;
calculating the loss between the prediction result x_pred and the target result x_gold using the following function:
Loss = CrossEntropy(x_pred, x_gold)
minimizing the Loss and updating the model weights, and training until the Loss no longer decreases;
calculating the accuracy of the model translation using the following function:
BLEU = BP · exp(Σ_{n=1}^{N} w_n · log p_n)
wherein p_n is the proportion of correctly predicted n-grams in the prediction result x_pred, w_n is the n-gram weight, and BP is the brevity penalty factor.
Further, importing the adjustment matrices M_train into the translation model comprises:
expanding the adjustment matrix M rightwards and downwards by adding 0-value elements, according to the model's preset maximum input length, to a size of L_max × L_max, obtaining M_train';
importing M_train' into the coding layer of the translation model.
Further, identifying the noun set S_word contained in the data set D comprises:
marking the nouns in the original text of each piece of data in the data set D using a tagging tool with built-in part-of-speech tagging; or
performing noun recognition on the original texts in the data set D using a noun recognition model trained as required.
Further, obtaining the noun translations corresponding to all nouns in the noun set S_word comprises:
obtaining the translation w_trans of a noun w to be matched in the original-text noun set S_word by querying the dictionary dict_noun;
searching the translations of the parallel corpus in the data set D for a word matching the translation w_trans; if present, adding w_trans to the translated-text noun set S_word-trans, and if not, deleting w from the original-text noun set S_word.
Further, obtaining the translation w_trans of an original-text noun w in the data set D comprises:
1) acquiring the noun w to be matched in the original-text noun set S_word;
2) directly querying the dictionary dict_noun with w as the key name; if a corresponding value exists, directly taking that value as the translation, and if not, proceeding to the next step;
3) calculating the similarity scores between the noun w to be matched and all keys of dict_noun, keys = {key_1, key_2, …, key_x}, to obtain a score set S = {s_1, s_2, …, s_x}, where x is the length of dict_noun;
4) finding the position of the element with the maximum value in the score set S, and, if the number of maximum elements in S is greater than 1, randomly taking one of them as the maximum element;
5) finding the corresponding key-value pair key_max and value_max in dict_noun according to the position of the maximum element, and using value_max as the translation.
Further, the translation model to be trained is constructed based on the transformers framework and comprises an encoder and a decoder, each of which comprises multiple layers of identical self-attention residual structures.
Compared with the prior art, the invention has at least one of the following beneficial effects:
1. The translation model is trained on a pre-constructed training set containing noun translation prompts together with the adjustment matrices, so that the model learns the intrinsic relation between nouns and their translations, improving the prompt-based noun-translation accuracy of the translation model.
2. The attention calculation of the model is adjusted by the constructed adjustment matrix, so that the model no longer calculates attention between a noun translation and the other original-text characters, and only calculates attention between a noun translation and its noun in the original text, improving the accuracy of the model.
3. By constructing input data containing noun translation prompts and the adjustment matrix, the translation model can translate nouns accurately, solving the problems of inaccurate, missed and mistaken noun and proper-noun translation in existing translation models.
In the invention, the technical schemes can be mutually combined to realize more preferable combination schemes. Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, like reference numerals being used to refer to like parts throughout the several views.
FIG. 1 is a flow chart of a translation model construction method based on noun translation hints according to an embodiment of the invention;
FIG. 2 is a schematic block diagram of a method for constructing a translation model based on noun translation hints according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a method for constructing an adjustment matrix according to an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form a part hereof, and together with the description serve to explain the principles of the invention, and are not intended to limit the scope of the invention.
In one embodiment of the present invention, as shown in fig. 1, a method for constructing a translation model based on noun translation hints is disclosed, comprising:
step S110, acquiring parallel corpus data of two languages to be translated, and obtaining a data set D; the two languages to be translated refer to two languages before and after translation; parallel corpus data refers to a corpus set formed by original text sentences and translation sentences corresponding to the original text sentences.
Step S120, identifying nouns in the original text and the translated text of each piece of data in the data set D to obtain the original-text noun set S_word and the translated-text noun set S_word-trans. Specifically, the noun set S_word can be obtained by feeding the texts in the data set D into the noun recognition module. The noun translations corresponding to all nouns in S_word can be looked up through the noun inter-translation dictionary built into the noun query module; the translations of the parallel corpus in the data set D are then searched for words matching these noun translations, and if a match exists, the matched word is added to the translated-text noun set S_word-trans; if not, the noun is deleted from the original-text noun set S_word. Preferably, among the nouns for which no matching translation is found, important original-text nouns can be screened out by manual identification, and the translated nouns in the corresponding parallel corpus can be added to the dictionary as new entries.
Step S130, obtaining training samples X_input of all data in D and the corresponding adjustment matrices M_train through data construction, wherein X_input = [x_1, x_2, …, x_g], M_train = [M_1, M_2, …, M_g], a single training sample x_i consists of the text x_input augmented with noun translation prompts and the target translation x_gold, i ∈ [1, 2, …, g], and g is the number of data items; the noun translation prompts are all the translations in the noun translation set S_word-trans.
Step S140, importing the adjustment matrices M_train into the translation model to be trained, and training the model with the training samples X_input to obtain the finally trained translation model NPTrans. Specifically, after adjustment by the adjustment matrices M_train, the translation model, when calculating attention, no longer calculates attention between a noun translation and the other original-text characters, and only calculates attention between a noun translation and its noun in the original text.
According to the embodiment of the invention, the translation model is trained on a pre-constructed training set containing noun translation prompts together with the adjustment matrices, so that the model learns the intrinsic relation between nouns and their translations, improving the prompt-based noun-translation accuracy of the model; the attention calculation of the model is adjusted by the constructed adjustment matrix, so that the model no longer calculates attention between noun translations and the other original-text characters and only calculates attention between noun translations and the nouns in the original text, improving the accuracy of the model; and by constructing input data containing noun translation prompts together with the adjustment matrix, the translation model can translate nouns accurately, solving the problems of inaccurate, missed and mistaken noun and proper-noun translation in existing translation models.
In a specific embodiment, the noun recognition module in step S120 is a tagging tool with built-in part-of-speech tagging or a noun recognition model trained as required. Optionally, the part-of-speech tagging tool is the jieba word-segmentation toolkit.
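As an illustrative sketch (not part of the patent text), extracting the original-text noun set S_word with the jieba toolkit could look as follows; in jieba's tag set, part-of-speech flags beginning with "n" mark nouns:

```python
# Sketch: noun recognition with jieba's part-of-speech tagger.
# Flags starting with "n" (n, nr, ns, nt, nz, ...) denote nouns.
import jieba.posseg as pseg

def extract_nouns(text):
    """Return the nouns of a sentence, in order of appearance."""
    return [p.word for p in pseg.cut(text) if p.flag.startswith("n")]

# extract_nouns("中国和美国贸易往来。") would return ["中国", "美国"]
```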
In a specific embodiment, the noun query module in step S120 comprises a noun dictionary and a query program;
wherein the noun dictionary is a dictionary (dict) containing all nouns required by the user; the keys of the dictionary are the nouns expressed in the language to be translated, and the values are the corresponding nouns expressed in the target language;
illustratively, for Chinese-to-English translation, the data structure of the dictionary is:
dict_noun = { "中国": "China", "美国": "America", … }
Optionally, the user may construct the noun dictionary from existing resources, by building it from scratch, etc.
The query program uses non-exact matching; preferably, matching is performed with a text-similarity algorithm;
further, the text-similarity matching steps are as follows (see the sketch after this list):
1. Directly query the dictionary dict_noun with any noun w to be matched as the key name; if a corresponding value exists, directly take that value as the translation w_trans; if not, proceed to the next step.
2. Calculate the similarity scores between the noun w to be matched and all keys of dict_noun, keys = {key_1, key_2, …, key_x}, obtaining a score set S = {s_1, s_2, …, s_x}, where x is the length of dict_noun. The similarity score is computed from the following quantities: len(w), the length of the word w; len(key_i), the length of the i-th key; exp(·), the exponential function; count_same, the number of overlapping grams between w and key_i under the n-gram; and count_n-gram, the number of grams of w under the n-gram, with n taking 1 to 3.
3. Find the position of the element with the maximum value in the score set S; if the number of maximum elements in S is greater than 1, randomly take one of them as the maximum element.
4. Find the corresponding key-value pair key_max and value_max in dict_noun according to the position of the maximum element, and use value_max as the translation w_trans.
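The similarity formula itself appears only as an image in the source publication, so the scoring function in the sketch below is an assumption built from the quantities named above (character n-gram overlap for n = 1 to 3 and an exp(·) length term); the lookup flow, exact key query followed by scoring and a random tie-break, follows steps 1 to 4:

```python
# Sketch of the noun-query module. The toy dict_noun and the
# similarity() scoring are illustrative; the patent's exact
# similarity formula is not reproduced here.
import math
import random

dict_noun = {"中国": "China", "美国": "America"}  # toy noun dictionary

def ngrams(s, n):
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def similarity(w, key):
    """ASSUMED score: n-gram overlap (n = 1..3) with a length penalty."""
    score = 0.0
    for n in range(1, 4):
        grams_w = ngrams(w, n)
        if not grams_w:
            continue
        grams_key = set(ngrams(key, n))
        count_same = sum(1 for g in grams_w if g in grams_key)  # overlapping grams
        score += count_same / len(grams_w)                      # count_same / count_n-gram
    return score * math.exp(-abs(len(w) - len(key)))            # exp(.) length term

def query_translation(w):
    if w in dict_noun:                                 # step 1: exact key lookup
        return dict_noun[w]
    keys = list(dict_noun)
    scores = [similarity(w, k) for k in keys]          # step 2: score set S
    best = max(scores)
    ties = [k for k, s in zip(keys, scores) if s == best]
    return dict_noun[random.choice(ties)]              # steps 3-4: random tie-break
```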
In a specific embodiment, the data construction in step S130 includes data cleaning, training sample construction and adjustment matrix construction;
the data cleaning cleans the parallel corpus data to obtain the cleaned data set D, including removing whitespace and redundant invalid characters, and unifying traditional and simplified characters (e.g. for Chinese);
the training sample construction includes: splicing, after the original text of each piece of data in the data set D, the corresponding noun translation set S_word-trans in order, with spaces separating the translations of the nouns, to obtain a single training sample x_i of the translation model, thereby obtaining the training sample set X_input;
the adjustment matrix construction includes: constructing, for each text x_input augmented with noun translation prompts, the list List_index of corresponding position relationships, and constructing, according to List_index, the adjustment matrix M_g corresponding to the single training sample x_i, thereby obtaining the adjustment matrices M_train corresponding to all the data.
Wherein constructing the list List_index of corresponding position relationships for each text x_input augmented with noun translation prompts comprises the following steps:
in each text x_input augmented with noun translation prompts, the position of each noun or noun translation in x_input is represented by a tuple; each noun-translation position tuple pair forms a sub-list; and all noun-translation position tuple pairs are connected to form one large list, namely List_index. By way of example, Table 1 illustrates the construction of an input text and its position-relationship list.
Table 1 Example of constructing an input text and its position-relationship list
Original text: 中国和美国贸易往来。 (trade between China and the United States)
S_word: {中国, 美国}
S_word-trans: {China, America}
X_input: 中国和美国贸易往来。China America
List_index: [[(1,2),(11,11)],[(4,5),(12,12)]]
It should be noted that the above List_index is only an example; in practice it must also be adjusted according to the word-segmentation result of X_input. Here, by default, Chinese text is segmented into single characters, English text is segmented into words, and the spaces between English words are not counted in the segmentation.
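A minimal sketch of this construction for the Table 1 example, assuming per-character segmentation of the Chinese original, one token per appended English translation, and first-occurrence noun positions (the helper name is illustrative):

```python
# Sketch: build X_input by appending the noun translations after the
# original text, and record List_index with 1-based positions.
def build_input(original, noun_pairs):
    """noun_pairs: list of (noun, translation) tuples."""
    x_input = original + " ".join(t for _, t in noun_pairs)
    list_index = []
    pos = len(original)          # characters 1..len(original) are the original text
    for noun, _trans in noun_pairs:
        start = original.index(noun) + 1     # 1-based position of first occurrence
        pos += 1                             # each appended translation is one token
        list_index.append([(start, start + len(noun) - 1), (pos, pos)])
    return x_input, list_index

x_input, list_index = build_input("中国和美国贸易往来。",
                                  [("中国", "China"), ("美国", "America")])
# list_index == [[(1, 2), (11, 11)], [(4, 5), (12, 12)]], as in Table 1
```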
Constructing the adjustment matrix M_g corresponding to a single training sample x_i, as shown in FIG. 3, includes: determining the values of the elements of the adjustment matrix M_g corresponding to the single training sample x_i according to List_index; special symbols are inserted in the start and end rows of the matrix, so that L = len(x_input) + 2; optionally, the special symbols inserted in the start and end rows of the matrix are [CLS] and [SEP] respectively;
Further, the value and constraints of the element M_i,j in the i-th row and j-th column of M_g are as follows:
(1) M_i,j = 0 when i and j each satisfy any one of the following conditions:
condition 1: less than or equal to len(x_i0) + 1;
condition 2: equal to L;
(2) M_i,j = 0 when i and j respectively fall within the two tuples List_one[0] and List_one[1] of some sub-list List_one in List_index;
(3) in the remaining cases, M_i,j is negative infinity (-∞);
optionally, -1e4 or -1e9 is used instead of negative infinity (-∞);
optionally, when -1e4 is used instead of negative infinity (-∞), the value of M_i,j is expressed as follows:
M_i,j = 0, if (i ≤ len(x_i0) + 1 or i = L) and (j ≤ len(x_i0) + 1 or j = L);
M_i,j = 0, if i falls within List_index[z][0] and j within List_index[z][1], or vice versa, for some z ∈ [1, len(List_index)];
M_i,j = -1e4, otherwise;
wherein len(x_i0) represents the length of the original text x_i0 after cleaning and before adding the prompts, len(List_index) represents the length of List_index (i.e. the number of sub-lists), List_index[z][0] represents the first tuple in the z-th sub-list of List_index, and List_index[z][1] represents the second tuple in the z-th sub-list.
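A sketch of the per-sample matrix construction under the rules above, using -1e4 in place of negative infinity; the numpy representation and the function name are illustrative:

```python
# Sketch: build the adjustment matrix M_g for one sample from
# List_index. With [CLS] in row 0, the 1-based position p of a token
# in x_input maps to 0-based row/column p; [SEP] occupies the last row.
import numpy as np

MASK = -1e4  # stands in for minus infinity

def build_adjust_matrix(len_x0, len_x_input, list_index):
    L = len_x_input + 2                      # L = len(x_input) + 2
    M = np.full((L, L), MASK, dtype=np.float32)

    def allowed(i):                          # 1-based matrix index
        return i <= len_x0 + 1 or i == L     # [CLS] + original text, or [SEP]

    for i in range(1, L + 1):                # condition (1)
        for j in range(1, L + 1):
            if allowed(i) and allowed(j):
                M[i - 1, j - 1] = 0.0
    for (ns, ne), (ts, te) in list_index:    # condition (2): noun <-> translation
        for i in range(ns, ne + 1):
            for j in range(ts, te + 1):
                M[i, j] = M[j, i] = 0.0      # position p sits at 0-based index p
    return M

# build_adjust_matrix(10, 12, list_index) reproduces the FIG. 3 pattern
```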
In a specific embodiment, the translation model to be trained employs a neural machine translation model comprising an encoder and a decoder. As shown in FIG. 2, the above step S140 can be further refined into the following steps:
step S210: the adjustment matrix M train Importing the coding layer of the translation model to be trained, and adjusting the calculation of parameters in the model;
specifically, the adjustment matrix M train Importing the training to be trainedAn encoding layer of a translation model, comprising: the maximum input length L preset according to the translation model to be trained max Expanding the adjustment matrix M to the right and downwards to a size L by adding 0 value elements max ×L max Obtaining M train 'A'; will M train ' import the coding layer;
specifically, the neural machine translation model is constructed by using a transformers framework, and comprises an encoder and a decoder, wherein the encoder and the decoder both comprise multiple layers of identical self-Attention residual structures, and an adjustment matrix is added to calculate self-Attention (Attention), and the calculation formula is as follows:
wherein Q, K, V is a matrix of Query, key, value, d in self-attention mechanism k Is the dimension of Q or K (both identical).
Preferably, the encoder and the decoder each comprise 12 layers of identical self-attention residual structures;
preferably, the dimension of the Query or Key in the self-attention mechanism is d_k = 64.
Illustratively, as in FIG. 3, after the adjustment matrix M_train' is added, the neural machine translation model no longer calculates the attention between noun translations and the other original-text characters (the grey part), and only calculates the attention between each noun translation and its noun in the original text (the four entries a_1,11, a_2,11, a_4,12 and a_5,12 in the figure); the remaining white part is the attention within the original text x_0.
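A sketch of the adjusted self-attention, i.e. plain scaled dot-product attention with the adjustment matrix added to the logits before the softmax, so that masked noun-translation/original-character pairs receive near-zero attention weight:

```python
# Sketch: scaled dot-product attention with the adjustment matrix M
# added to the attention logits before the softmax.
import torch
import torch.nn.functional as F

def adjusted_attention(Q, K, V, M):
    # Q, K, V: (batch, seq, d_k); M: (seq, seq) filled with 0 / -1e4
    d_k = Q.size(-1)
    logits = Q @ K.transpose(-2, -1) / d_k ** 0.5 + M
    return F.softmax(logits, dim=-1) @ V
```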
Step S220: training the translation model into which the adjustment matrices M_train have been imported with the training samples X_input, to obtain the finally trained translation model NPTrans.
Specifically, X_input is divided into a training set D_train, a validation set D_valid and a test set D_test; the adjustment matrices M_train are imported into the translation model, the model is trained with D_train and validated with D_valid after each round of training, and the round of the model with the best validation result is taken as the final model NPTrans. Preferably, the ratio of the training set D_train, the validation set D_valid and the test set D_test is 8:1:1.
Further, during training, for each text x_i, the attention of that text at the encoder is calculated using the following formula:
Attention(Q_i, K_i, V_i) = softmax((Q_i · K_i^T) / √d_k + M_i') · V_i
wherein Q_i, K_i, V_i are the Query, Key and Value matrices when calculating the attention of x_i, d_k is the dimension of Q_i or K_i (the two are identical), generally taken as d_k = 64, and M_i' is the expanded adjustment matrix of x_i.
The loss function between the prediction result x_pred and the target result x_gold is expressed as:
Loss = CrossEntropy(x_pred, x_gold)
The Loss is minimized and the model weights are updated; training proceeds until the Loss no longer decreases.
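One training step under this loss, sketched with PyTorch; the model's call signature and the batch fields are placeholders rather than the patent's API:

```python
# Sketch of one training step: cross-entropy between the predicted
# token distribution and the target translation, then a weight update.
import torch.nn.functional as F

def train_step(model, optimizer, batch):
    logits = model(batch["input_ids"], adjust_matrix=batch["M"])       # (B, T, vocab)
    loss = F.cross_entropy(logits.transpose(1, 2), batch["gold_ids"])  # (B, V, T) vs (B, T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```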
During validation, the accuracy of the model translation is calculated using the BLEU score:
BLEU = BP · exp(Σ_{n=1}^{N} w_n · log p_n)
wherein p_n is the proportion of correctly predicted n-grams in the prediction result x_pred, i.e.
p_n = (number of n-grams of x_pred that also appear in x_gold) / (total number of n-grams of x_pred)
and w_n is the weight of the n-gram term; BP is the brevity penalty factor, which penalizes the case where the length of the prediction result x_pred is smaller than the length of x_gold:
BP = 1, if len(x_pred) ≥ len(x_gold); otherwise BP = exp(1 - len(x_gold) / len(x_pred))
After the round of the model with the best validation result has been taken as the final model NPTrans, testing can be performed with D_test.
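A compact sketch of the BLEU computation matching the formulas above, with uniform weights w_n = 1/N; in practice an existing implementation such as sacrebleu would more likely be used:

```python
# Sketch: sentence-level BLEU with uniform n-gram weights and the
# brevity penalty defined above. pred and gold are token lists.
import math
from collections import Counter

def bleu(pred, gold, max_n=4):
    log_p = 0.0
    for n in range(1, max_n + 1):
        pred_ngrams = Counter(tuple(pred[i:i + n]) for i in range(len(pred) - n + 1))
        gold_ngrams = Counter(tuple(gold[i:i + n]) for i in range(len(gold) - n + 1))
        overlap = sum((pred_ngrams & gold_ngrams).values())    # clipped matches
        total = max(sum(pred_ngrams.values()), 1)
        log_p += math.log(max(overlap / total, 1e-9)) / max_n  # w_n = 1/N
    if len(pred) >= len(gold):
        bp = 1.0                                               # no length penalty
    else:
        bp = math.exp(1 - len(gold) / max(len(pred), 1))       # brevity penalty
    return bp * math.exp(log_p)
```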
Further, during actual translation, the noun set and the noun translation set of the text to be translated are constructed through the noun recognition module and the noun query module, and the input text of the translation model and its adjustment matrix M are further constructed through the data construction; the input text is translated with the trained final model NPTrans, the attention calculation of the model is adjusted with the adjustment matrix, and the translation is finally output.
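Putting the modules together, the translation-time flow could be sketched as follows, reusing the helper sketches above; model.generate stands in for the actual decoding call and is a placeholder:

```python
# Sketch of inference: noun recognition, dictionary lookup, prompted
# input and adjustment-matrix construction, then decoding with NPTrans.
# Assumes per-character segmentation of the source, as in the sketches above.
def translate(text, model):
    nouns = extract_nouns(text)                          # noun recognition module
    pairs = [(w, query_translation(w)) for w in nouns]   # noun query module
    x_input, list_index = build_input(text, pairs)
    n_tokens = len(text) + len(pairs)                    # characters + appended translations
    M = pad_adjust_matrix(build_adjust_matrix(len(text), n_tokens, list_index))
    return model.generate(x_input, adjust_matrix=M)      # placeholder decoding call
```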
Compared with the prior art, the embodiment of the invention trains the translation model on a pre-constructed training set containing noun translation prompts together with the adjustment matrices, so that the model learns the intrinsic relation between nouns and their translations, improving the prompt-based noun-translation accuracy of the model; the attention calculation of the model is adjusted by the constructed adjustment matrix, so that the model no longer calculates attention between noun translations and the other original-text characters and only calculates attention between noun translations and the nouns in the original text, improving the accuracy of the model; and by constructing input data containing noun translation prompts together with the adjustment matrix, the translation model can translate nouns accurately, solving the problems of inaccurate, missed and mistaken noun and proper-noun translation in existing translation models.
Those skilled in the art will appreciate that all or part of the flow of the methods of the above embodiments may be accomplished by a computer program instructing associated hardware, where the program may be stored on a computer-readable storage medium such as a magnetic disk, an optical disk, a read-only memory or a random access memory.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims (9)

1. A translation model construction method based on noun translation prompts, characterized by comprising the following steps:
acquiring parallel corpus data of two languages to be translated to obtain a data set D;
identifying nouns in the original text and the translated text of each piece of data in the data set D to obtain an original-text noun set S_word and a translated-text noun set S_word-trans of each piece of data;
obtaining training samples X_input of all data in D and the corresponding adjustment matrices M_train through data construction, wherein X_input = [x_1, x_2, …, x_g], M_train = [M_1, M_2, …, M_g], a single training sample x_i consists of the text x_input augmented with noun translation prompts and the target translation x_gold, i ∈ [1, 2, …, g], and g is the number of data items;
importing the adjustment matrices M_train into the translation model to be trained, and training the model with the training samples X_input to obtain the finally trained translation model NPTrans, which comprises:
calculating the attention of the model after importing the adjustment matrices M_train using the following function:
Attention(Q_i, K_i, V_i) = softmax((Q_i · K_i^T) / √d_ki + M_i) · V_i
wherein Q_i, K_i, V_i are the Query, Key and Value matrices when calculating the attention of x_i, and d_ki is the dimension of Q_i or K_i;
calculating the loss between the prediction result x_pred and the target result x_gold using the following function:
Loss = CrossEntropy(x_pred, x_gold)
minimizing the Loss and updating the model weights, and training until the Loss no longer decreases;
calculating the accuracy of the model translation using the following function:
BLEU = BP · exp(Σ_{n=1}^{N} w_n · log p_n)
wherein p_n is the proportion of correctly predicted n-grams in the prediction result x_pred, w_n is the n-gram weight, and BP is the brevity penalty factor.
2. The translation model construction method according to claim 1, wherein the data construction comprises the steps of:
carrying out data cleaning on the parallel corpus data to obtain a cleaned data set D;
splicing, after the original text of each piece of data in the data set D, the translations of all nouns in the noun translation set S_word-trans in order, to obtain the input texts X_input of the translation model;
constructing, for each text x_input augmented with noun translation prompts, a list List_index of corresponding position relationships; determining the values M_ij of the elements of the adjustment matrix according to List_index, and inserting special symbols in the start and end rows to obtain the adjustment matrix M_g corresponding to the single training sample x_i, thereby obtaining the adjustment matrices M_train corresponding to all the data.
3. The translation model construction method according to claim 2, wherein constructing the list List_index of corresponding position relationships for each text x_input augmented with noun translation prompts comprises the following steps:
representing, for each input text x_input of the translation model, the position of each pair of noun and noun translation in x_input by a pair of tuples;
each noun-translation position tuple pair forming a sub-list;
connecting all the noun-translation position tuple pair sub-lists to form the list List_index of position relationships of the text.
4. The translation model construction method according to claim 2 or 3, wherein the values and constraints of the adjustment matrix elements M_ij are as follows:
M_ij = 0, if (i ≤ len(x_i0) + 1 or i = L) and (j ≤ len(x_i0) + 1 or j = L);
M_ij = 0, if i falls within the tuple List_index[z][0] and j within the tuple List_index[z][1], or vice versa, for some z ∈ [1, len(List_index)];
M_ij = -∞, otherwise;
wherein len(x_i0) represents the length of the original text x_i0 of a single training sample after cleaning and before adding the translation prompts, L = len(x_input) + 2, len(List_index) represents the length of List_index (i.e. the number of sub-lists), List_index[z][0] represents the first tuple in the z-th sub-list of List_index, and List_index[z][1] represents the second tuple in the z-th sub-list.
5. The translation model construction method according to claim 1, wherein importing the adjustment matrices M_train into the translation model comprises:
expanding the adjustment matrix M rightwards and downwards by adding 0-value elements, according to the model's preset maximum input length, to a size of L_max × L_max, obtaining M_train';
importing M_train' into the coding layer of the translation model.
6. The translation model construction method according to claim 1, wherein identifying the noun set S_word contained in the data set D comprises:
marking the nouns in the original text of each piece of data in the data set D using a tagging tool with built-in part-of-speech tagging; or
performing noun recognition on the original texts in the data set D using a noun recognition model trained as required.
7. The translation model construction method according to claim 1 or 6, wherein obtaining the noun translations corresponding to all nouns in the noun set S_word comprises:
obtaining the translation w_trans of a noun w to be matched in the original-text noun set S_word by querying the dictionary dict_noun;
searching the translations of the parallel corpus in the data set D for a word matching the translation w_trans; if present, adding w_trans to the translated-text noun set S_word-trans, and if not, deleting w from the original-text noun set S_word.
8. The translation model construction method according to claim 7, wherein obtaining the translation w_trans of an original-text noun w in the data set D comprises:
1) acquiring the noun w to be matched in the original-text noun set S_word;
2) directly querying the dictionary dict_noun with w as the key name; if a corresponding value exists, directly taking that value as the translation, and if not, proceeding to the next step;
3) calculating the similarity scores between the noun w to be matched and all keys of dict_noun, keys = {key_1, key_2, …, key_x}, to obtain a score set S = {s_1, s_2, …, s_x}, where x is the length of dict_noun;
4) finding the position of the element with the maximum value in the score set S, and, if the number of maximum elements in S is greater than 1, randomly taking one of them as the maximum element;
5) finding the corresponding key-value pair key_max and value_max in dict_noun according to the position of the maximum element, and using value_max as the translation.
9. The translation model construction method according to claim 1, wherein the translation model to be trained is constructed based on the transformers framework and comprises an encoder and a decoder, each of which comprises multiple layers of identical self-attention residual structures.
CN202211348033.6A (priority date 2022-10-31, filing date 2022-10-31): Translation model construction method based on noun translation prompt. Status: Active. Granted publication: CN115688904B (en).

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202211348033.6A | 2022-10-31 | 2022-10-31 | Translation model construction method based on noun translation prompt (granted as CN115688904B)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202211348033.6A | 2022-10-31 | 2022-10-31 | Translation model construction method based on noun translation prompt (granted as CN115688904B)

Publications (2)

Publication Number | Publication Date
CN115688904A (en) | 2023-02-03
CN115688904B (en) | 2023-07-18

Family

Family ID: 85047111

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202211348033.6A (Active) | Translation model construction method based on noun translation prompt | 2022-10-31 | 2022-10-31

Country Status (1)

Country Link
CN (1) CN115688904B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
KR102449842B1 * | 2017-11-30 | 2022-09-30 | 삼성전자주식회사 | Method for training language model and apparatus therefor
CN114091481A * | 2020-08-24 | 2022-02-25 | 四川医枢科技股份有限公司 | Medical machine translation method based on sentence translation keywords
CN114925708A * | 2022-05-24 | 2022-08-19 | 昆明理工大学 | Thai-Chinese neural machine translation method fusing unsupervised dependency syntax
CN115017923A * | 2022-05-30 | 2022-09-06 | 华东师范大学 | Professional term vocabulary alignment replacement method based on Transformer translation model

Also Published As

Publication Number | Publication Date
CN115688904A (en) | 2023-02-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant