CN111274826B - Semantic information fusion-based low-frequency word translation method


Info

Publication number
CN111274826B
Authority
CN
China
Prior art keywords
low-frequency words
vector
vector representation
Prior art date
Legal status
Active
Application number
CN202010060672.7A
Other languages
Chinese (zh)
Other versions
CN111274826A (en)
Inventor
张学强
董晓飞
曹峰
石霖
孙明俊
Current Assignee
Nanjing New Generation Artificial Intelligence Research Institute Co ltd
Original Assignee
Nanjing New Generation Artificial Intelligence Research Institute Co ltd
Priority date
Filing date
Publication date
Application filed by Nanjing New Generation Artificial Intelligence Research Institute Co ltd
Priority to CN202010060672.7A
Publication of CN111274826A
Application granted
Publication of CN111274826B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks


Abstract

The invention provides a low-frequency word translation method based on semantic information fusion, belonging to the field of machine translation. A bilingual sentence pair, consisting of a source-language sentence x and its corresponding target-language sentence y, is input into a translation system; the subword sequences of the low-frequency words in the source-language sentence and the corresponding target translations in the target-language sentence are obtained; the low-frequency words in the bilingual pair (x, y) are replaced with wildcards UNKi to obtain a new bilingual pair $(\hat{x}, \hat{y})$; and the vector representations of the source-language and/or target-language low-frequency words are fused with the vector representation of the wildcard UNKi. Closely following the core idea of semantic fusion, the invention provides three concrete fusion forms, based on the source-language low-frequency word vector representation, the target-language low-frequency word vector representation, and the low-frequency word vector representations at both ends, and makes full use of the low-frequency word's vectors in two languages and two vector spaces to represent its semantic information.

Description

Semantic information fusion-based low-frequency word translation method
Technical Field
The invention relates to the field of machine translation, in particular to the low-frequency word translation task in neural machine translation systems. The semantic vector representations of low-frequency words at the source end and the target end are fully utilized during model training and decoding, improving the translation quality of the low-frequency words and even of the whole sentence.
Background
Low-frequency words are words that occur rarely or never in a large-scale bilingual parallel corpus. Depending on their frequency, they are also commonly called unknown words or out-of-vocabulary (OOV) words in natural language processing. Because of their frequency sparsity, uniqueness of translation and other characteristics, low-frequency word translation has always been a key difficulty in machine translation research. In particular, in today's mainstream neural machine translation, where the vocabulary is limited and the modeling process depends on vector representations, the low-frequency word translation problem is receiving increasing attention from academia and industry.
With the further development of globalization, machine translation has become an important research topic for communication between speakers of different languages. The quality of low-frequency word translation directly affects whether machine translation technology and applications can successfully reach practical and industrial use. Traditional low-frequency word processing methods fall into two categories. First, subword segmentation methods represented by Byte Pair Encoding (BPE) reduce the number of modeling units by further segmenting words into subwords. Second, low-frequency words are converted into wildcards, which are replaced with the target low-frequency translations after decoding to form the final complete translation. For the second method, however, directly replacing low-frequency words with wildcards leaves the sentence's semantic information incomplete, which in turn harms the completeness and fluency of the translations generated by decoding. The invention provides a low-frequency word translation method based on semantic fusion that fuses the vector representations of the low-frequency word and the wildcard during training and decoding of the neural machine translation model, so that the semantic information of both is explicitly considered at the same time, effectively improving the accuracy and fluency of translations.
Throughout the development of machine translation, from rule-based to statistics-based to deep-learning-based systems, low-frequency word translation has always been a problem demanding urgent solution. As described above, low-frequency word processing has derived two broad categories. The first, from the segmentation angle, generates subword units of smaller granularity by counting the occurrence frequencies of subwords in a large-scale corpus; the typical method of this category is Byte Pair Encoding (BPE). The second, from the replacement angle, expresses nouns or noun phrases in a sentence with wildcards before translation and, in post-editing after translation, replaces these special marks with the target low-frequency words; the typical method of this category is wildcard replacement translation.
The subword-based low-frequency word translation methods are as follows:
Based on a counting model, neural machine translation selects the N most frequent words, subwords or characters as modeling units under the premise of a limited vocabulary size; the remaining words or phrases are expressed as combinations of these modeling units. There are two typical methods:
the method comprises the following steps: word model modeling
The word model is a model using a word as a modeling unit. In natural language, the more the upper level unit has rich and various expression forms, and the more the lower level unit has a relatively single form. Like the line, surface and surface in mathematics, the characters in natural language constitute words, words constitute phrases and phrases constitute sentences. Statistically, although the total number of Chinese characters exceeds 8 ten thousand, the number of commonly used Chinese characters is only about 3500, and it is enough to combine thousands of words or phrases. Therefore, this method is often used in the field of machine translation where the number of modeling units is severely limited. In the end-to-end neural machine translation, the effect is better than that of a modeling mode taking a word as a unit on the whole, and the method is widely applied once.
Method two: byte pair encoding
Byte pair encoding is a data compression method proposed by Gage in 1994; its idea is to recursively use a single, unused byte to represent the most frequently co-occurring byte pair in a sequence. Applied analogously to Chinese subword segmentation, the first N character pairs with the highest co-occurrence frequency in Chinese sentences are taken as modeling units. For example, for the word "机器人" (robot), the co-occurrence frequency of "机" and "器" is typically high in a large-scale corpus, while the frequency with which all three characters "机", "器" and "人" co-occur may be relatively low. In this case the byte pair encoding method segments "机器人" into the subwords "机器" and "人" as two different modeling units. In end-to-end neural machine translation, this joint character-word modeling generally performs better than modeling with single characters or whole words alone.
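To make the merge step concrete, the following minimal Python sketch (a toy illustration, not code from the patent) counts frequency-weighted co-occurrences of adjacent symbols and applies one merge, reproducing the segmentation of 机器人 into 机器 and 人:

```python
from collections import Counter

def most_frequent_pair(corpus):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in corpus.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(corpus, pair):
    """Rewrite every word, replacing `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in corpus.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus keyed by character tuples with corpus frequencies.
corpus = {("机", "器"): 9, ("机", "器", "人"): 2}
best = most_frequent_pair(corpus)      # ("机", "器") co-occurs most often
corpus = merge_pair(corpus, best)      # "机器人" becomes ("机器", "人")
```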
The replacement-based low-frequency word translation methods are as follows:
the method comprises the following steps: word in set replacement
The core idea of the intra-word replacement method is that the intra-word with the highest frequency and most similar to the low-frequency words in the large-scale corpus is adopted to replace the low-frequency words. According to the realization principle of the current mainstream neural machine translation method, a word list with fixed dimensionality needs to be generated in advance, and the method usually adopted is to count all M words appearing in large-scale linguistic data
Figure GDA0002445555890000021
Frequency of (2)
Figure GDA0002445555890000022
The first N words in descending order are selected according to the word frequency to form a word list (W)N. At this point we will include the words in the vocabulary
Figure GDA0002445555890000031
Called words in the set, correspondingly takes the rest M-N words
Figure GDA0002445555890000032
Called an out-of-set word. The general method of the intra-set word replacement method is to match an intra-set word with the most similar semanteme for each out-set word by calculating the vector distance between word vectors. In the model training and decoding process, all the out-of-set words which are difficult to process are converted into the in-set words, and the target translation of the out-of-set words is only converted back into the translation after decoding, so that the aim of solving the translation of the low-frequency words is fulfilled.
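A minimal sketch of the vocabulary construction and in-set replacement just described; the nearest-neighbor search over word vectors is abstracted into a precomputed `similar` mapping, which is an assumption made here for brevity:

```python
from collections import Counter

def build_vocab(corpus_tokens, n):
    """Select the N most frequent of the M corpus words as in-set words."""
    freq = Counter(corpus_tokens)
    return {w for w, _ in freq.most_common(n)}

def replace_out_of_set(sentence, vocab, similar):
    """Replace each out-of-set word with its most similar in-set word
    (in a real system, found by word-vector distance)."""
    return [w if w in vocab else similar.get(w, "<unk>") for w in sentence]

vocab = build_vocab(["the", "cat", "sat", "the", "the", "cat"], n=2)
print(replace_out_of_set(["the", "cat", "sat"], vocab, {"sat": "cat"}))
# ['the', 'cat', 'cat']
```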
Method two: low-frequency word category replacement
The advantage of method one is that a semantically similar in-set replacement word preserves the meaning of the source sentence as much as possible. Its drawback is that, in attention-based neural machine translation with soft alignment between the source and target sentences, the position of the replacement word in the translation is hard to identify, so the target translation of the out-of-set word is hard to substitute back. One way to solve this problem is to replace each out-of-set word with a wildcard for its category. For example, person names in a bilingual sentence pair are typically replaced with the wildcard "$_person", and place and organization names with "$_location" and "$_organization" respectively. Finally, these category symbols are replaced with the target translations of the low-frequency person, place and organization names to complete the translation process. The advantage of this method is that the special wildcards survive unchanged in the target translation, making the final substitution easy. Its drawbacks are sensitivity to the low-frequency word's category and a tendency toward disorder in post-editing when a sentence contains several low-frequency words of the same category.
Method three: UNKi replacement
To alleviate the problems of method two, the UNKi replacement method was proposed. Its replacement principle is not to recognize the low-frequency word's category but to uniformly replace the low-frequency words in a sentence with the wildcards UNKi (i = 1, 2, 3, ...). This not only avoids the inconsistency between a low-frequency word and its context caused by category recognition errors, but also solves the problem of substituting the low-frequency words back in order during translation.
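A minimal sketch of UNKi replacement, assuming the low-frequency spans have already been located by word-frequency statistics; this reproduces the numbered-wildcard form used throughout the description:

```python
def replace_with_unki(tokens, low_freq_spans):
    """Replace each low-frequency span with a numbered wildcard UNK1, UNK2, ...
    `low_freq_spans` is a list of (start, end) index pairs, left to right."""
    out, i, k = [], 0, 0
    for start, end in low_freq_spans:
        out.extend(tokens[i:start])
        k += 1
        out.append(f"UNK{k}")
        i = end
    out.extend(tokens[i:])
    return out

# x = (x1, ..., x9): spans (1, 3) and (5, 8) become UNK1 and UNK2
x = ["x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9"]
print(replace_with_unki(x, [(1, 3), (5, 8)]))
# ['x1', 'UNK1', 'x4', 'x5', 'UNK2', 'x9']
```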
In addition, some low-frequency word processing methods jointly use subword segmentation and a replacement mechanism: on top of subword segmentation, the lower-frequency subwords are further replaced, yielding better translation performance. On the basis of jointly adopting subword segmentation and UNKi replacement, the invention innovatively proposes fusing the vector representations of the low-frequency word and the UNKi wildcard, so as to effectively improve the translation of the low-frequency words and even of the whole sentence.
The prior art has the following disadvantages:
In a machine translation system, especially an end-to-end neural machine translation system, segmenting subwords with a statistical method or handling low-frequency words with a replacement method is a feasible approach. However, once a low-frequency word has been replaced under the existing replacement schemes, the influence of the low-frequency word's information on the semantic encoding of the wildcard's context is no longer considered. The low-frequency word may be translated, yet it connects poorly with its context in the translation, i.e., the fluency of the translation degrades.
Disclosure of Invention
The invention provides a novel solution for low-frequency words, targeting the wildcard (UNKi) replacement method. During the encoding of the source-language sentence, the vector representations of the source-language and/or target-language low-frequency words are fused with the vector representation of the wildcard UNKi, improving the translation of the low-frequency words and their context. In addition, considering that current neural machine translation systems commonly combine subword segmentation with a replacement mechanism to address low-frequency word translation, a low-frequency word is generally segmented into a subword sequence. The invention therefore also provides a low-frequency word vector encoding method based on a long short-term memory (LSTM) network, to obtain complete semantic vector representations of the several subwords contained in the source-language and target-language low-frequency words. Finally, the vector representations of the source-end and target-end low-frequency words are fused with the wildcard UNKi to improve the translation of the low-frequency words and even of the full text.
In order to achieve this purpose, the technical scheme adopted by the invention is as follows: a low-frequency word translation method based on semantic information fusion, characterized by: inputting a bilingual sentence pair, consisting of a source-language sentence x and its corresponding target-language sentence y, into the translation system; obtaining the subword sequences of the low-frequency words in the source-language sentence; obtaining the target translations corresponding to the low-frequency words in the target-language sentence; replacing the low-frequency words in the bilingual sentence pair (x, y) with wildcards UNKi to obtain a new bilingual sentence pair $(\hat{x}, \hat{y})$; and fusing the vector representations of the source-language and/or target-language low-frequency words with the vector representation of the wildcard UNKi.
Further, the subword sequences of the low-frequency words to be processed in the source-language sentence are obtained through word-frequency statistics. Bilingual low-frequency words are difficult to model because their internal statistical regularities are relatively vague and they occur very rarely in the training corpus. The invention therefore translates low-frequency words preferentially by dictionary lookup and, if the lookup fails, by a low-frequency word translation model.

Dictionary lookup finds the translation corresponding to a word by consulting a dictionary during translation. Its advantage is that a vocabulary of low-frequency words (entities, terms, proper nouns and the like) can be constructed in advance for the specific application scenario and domain of the machine translation system; as long as the low-frequency word to be looked up hits the vocabulary, the returned translation is guaranteed to be completely correct and appropriate to the situation. Its drawback is that the hit rate depends on the scale of the constructed vocabulary and is hard to guarantee.

The translation model supplements dictionary lookup; its advantage is that it outputs the highest-probability target translation for any given low-frequency word. The invention preferentially adopts a character model to cope with the shortness and low frequency of low-frequency words. The character model takes the character, rather than the word, as the processing unit during translation.
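The lookup-first, model-fallback strategy can be sketched as follows; `lexicon` and `char_model` are hypothetical stand-ins for the prebuilt low-frequency word list and the trained character-level translation model:

```python
def translate_low_freq(word, lexicon, char_model):
    """Dictionary lookup first; fall back to a character-level model."""
    hit = lexicon.get(word)
    if hit is not None:
        return hit                        # a vocabulary hit is always correct
    chars = list(word)                    # characters, not words, as units
    return char_model.translate(chars)    # highest-probability translation
```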
Further, the translation method that fuses the vector representation of the source-language low-frequency word with the vector representation of the wildcard UNKi comprises the following steps:

Step 1: take out the replaced source-language sentence $\hat{x}$ and the vector representations $\hat{v}$ of all its words.

Step 2: take out the vector representations of the low-frequency word's subwords in the source-language sentence and encode them with an LSTM to obtain the vector representation $s_{unk_i}$.

Step 3: apply a weighted sum and a nonlinear transformation to the low-frequency word's wildcard vector representation and the source-language subword-sequence vector representation to obtain the low-frequency word's final vector representation $V_i$:

$$V_i = \tanh(W_{u,unk_i} \cdot u_{unk_i} + W_{s,unk_i} \cdot s_{unk_i})$$

where tanh is the hyperbolic tangent function, $u_{unk_i}$ is the wildcard's vector representation, $W_{u,unk_i}$ is the weight assigned to $u_{unk_i}$ according to the specific contextual semantic environment, and $W_{s,unk_i}$ is the weight assigned to the LSTM-encoded $s_{unk_i}$.
The translation method that fuses the vector representation of the target-language low-frequency word with the vector representation of the wildcard UNKi comprises the following steps:

Step 1: take out the replaced source-language sentence $\hat{x}$ and the vector representations $\hat{v}$ of all its words.

Step 2: take out the vector representations of the low-frequency word's subwords in the target-language sentence and encode them with an LSTM to obtain the vector representation $t_{unk_j}$.

Step 3: apply a weighted sum and a nonlinear transformation to the low-frequency word's wildcard vector representation and the target-language subword-sequence vector representation to obtain the low-frequency word's final vector representation $V_j$:

$$V_j = \tanh(W_{u,unk_j} \cdot u_{unk_j} + W_{t,unk_j} \cdot t_{unk_j})$$

where tanh is the hyperbolic tangent function, $u_{unk_j}$ is the wildcard's vector representation, $W_{u,unk_j}$ is the weight assigned to $u_{unk_j}$ according to the specific contextual semantic environment, and $W_{t,unk_j}$ is the weight assigned to the LSTM-encoded $t_{unk_j}$.
The translation method that fuses the vector representations of both the source-language and target-language low-frequency words with the vector representation of the wildcard UNKi comprises the following steps:

Step 1: take out the replaced source-language sentence $\hat{x}$ and the vector representations $\hat{v}$ of all its words.

Step 2: take out the vector representations of the low-frequency word's subwords in the source-language and target-language sentences and encode them with LSTMs to obtain the vector representations $s_{unk_m}$ and $t_{unk_m}$ respectively.

Step 3: apply a weighted sum and a nonlinear transformation to the low-frequency word's wildcard vector representation, the source-language subword-sequence vector representation and the target-language subword-sequence vector representation to obtain the low-frequency word's final vector representation $V_m$:

$$V_m = \tanh(W_{u,unk_m} \cdot u_{unk_m} + W_{s,unk_m} \cdot s_{unk_m} + W_{t,unk_m} \cdot t_{unk_m})$$

where tanh is the hyperbolic tangent function, $u_{unk_m}$ is the wildcard's vector representation, $W_{u,unk_m}$ is the weight assigned to $u_{unk_m}$ according to the specific contextual semantic environment, $W_{s,unk_m}$ is the weight assigned to $s_{unk_m}$, and $W_{t,unk_m}$ is the weight assigned to $t_{unk_m}$.
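For concreteness, the three fusion forms can be written as one module; the following is a PyTorch sketch under assumed names and dimensions, not the patent's reference implementation. Omitting the source or target term recovers the single-end variants $V_i$ and $V_j$:

```python
import torch
import torch.nn as nn

class SemanticFusion(nn.Module):
    """V = tanh(W_u*u + W_s*s + W_t*t): fuse the wildcard vector u with the
    LSTM-encoded source subword vector s and/or target subword vector t."""
    def __init__(self, dim):
        super().__init__()
        self.W_u = nn.Linear(dim, dim, bias=False)
        self.W_s = nn.Linear(dim, dim, bias=False)
        self.W_t = nn.Linear(dim, dim, bias=False)

    def forward(self, u, s=None, t=None):
        z = self.W_u(u)
        if s is not None:          # source-end term
            z = z + self.W_s(s)
        if t is not None:          # target-end term
            z = z + self.W_t(t)
        return torch.tanh(z)

fusion = SemanticFusion(dim=512)
u, s, t = torch.randn(512), torch.randn(512), torch.randn(512)
V_m = fusion(u, s, t)              # both-end fusion
V_i = fusion(u, s=s)               # source-only fusion
```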
In the specific implementation of fusing the low-frequency word vectors of the source end and the target end, the invention offers three main advantages:

First, regarding the sources of the low-frequency word vectors to be fused, the vector representations of the low-frequency word's semantic information in both the source-language and target-language vector spaces are used simultaneously.

Second, regarding the encoding of a low-frequency word's multiple subword vectors, a long short-term memory (LSTM) network scans the subword sequence in reverse, so that the first subword contributes the most important information to the final vector representation.

Third, regarding the semantic fusion of the low-frequency word, the hyperbolic tangent tanh is adopted as the activation function to fuse the three vectors, which amounts to a nonlinear transformation after a weighted summation of them.
Compared with the prior art, the low-frequency word processing method based on semantic fusion provided by the invention has the following advantages:
1. The method fuses the vector representations of low-frequency words and wildcards in the Encoder module of end-to-end neural machine translation, so that more accurate and complete information is encoded into the intermediate vector, helping the Attention and Decoder modules generate higher-quality translations.
2. Besides solving low-frequency word translation, the method also helps improve the translation of common named entities, complex noun phrases and technical terms in natural-language sentences.
3. Closely following the core idea of semantic fusion, the invention provides three concrete fusion forms, based on the source-language low-frequency word vector representation, the target-language low-frequency word vector representation, and the low-frequency word vector representations at both ends, making full use of the low-frequency word's vectors in two languages and two vector spaces to represent its semantic information.
4. Through subword segmentation and re-encoding, the proposed semantic fusion method obtains good subword vector representations by model learning, on the premise that the subwords within a low-frequency word retain relatively high frequency, and then obtains the low-frequency word's complete vector representation by LSTM-encoding the subword vectors.
5. The proposed semantic fusion method encodes the low-frequency word's subword sequence in reverse with an LSTM to obtain its complete vector representation, mitigating the inaccurate vector representations caused by the sparsity of low-frequency words. The semantic-fusion-based low-frequency word translation method is therefore applicable to neural machine translation with whole words, subwords or characters as modeling units.
Drawings
FIG. 1 is a diagram of a neural network translation model based on RNN and Attention in this embodiment.
Fig. 2 is a schematic diagram of a semantic fusion process of a vector representation of a source language low-frequency word and a vector representation of a wildcard UNKi in this embodiment.
Fig. 3 is a schematic diagram of a semantic fusion process of the vector representation of the low-frequency word in the target language and the vector representation of the wildcard UNKi in this embodiment.
Fig. 4 is a schematic diagram of the hyperbolic tangent function tanh in this embodiment.
Fig. 5 is a schematic diagram of a semantic fusion process of the vector representations of the source language low-frequency words and the target language low-frequency words and the vector representation of the wildcard UNKi in this embodiment.
Detailed Description
In order to help those skilled in the art better understand the technical solution of the invention, the flow of a neural machine translation system is first described using a translation system based on a recurrent neural network (RNN) and an attention mechanism (Attention) as an example; the same framework is then used to describe how to effectively fuse the vector representations of the source-language and target-language low-frequency words with the vector representation of the wildcard UNKi. It should be noted that the invention also extends to other neural network translation systems, such as translation systems based on convolutional neural networks (CNN) or entirely on attention.
Description of the RNN- and Attention-based translation system:

As shown in FIG. 1, the RNN- and Attention-based neural network translation model takes as input a source-language sentence $x = (x_1, x_2, x_3, \dots, x_m)$ to be translated and outputs a target-language sentence $y = (y_1, y_2, y_3, \dots, y_n)$, where the source and target sentences have lengths m and n respectively. The translation framework comprises three modules: a bidirectional-RNN-based Encoder, an Attention module and an RNN-based Decoder. The flow of each module is described below.

Encoder module flow:

This module computes the representation of each word of the input source sentence in its sentence context. Given a source-language sentence $x = (x_1, x_2, x_3, \dots, x_T)$, pre-trained or randomly initialized word vectors are first loaded, and the vector representation $v_i$ of each word $x_i$ is obtained by word-vector table lookup. From these word vectors, a forward recurrent neural network computes a representation $f_i$ in which each word sees the historical vocabulary information, and a backward recurrent neural network computes a representation $b_i$ in which each word sees the future vocabulary information. Finally the two are concatenated as $h_i = [f_i : b_i]$, the representation of each word within the sentence. The recurrent network here can be a plain RNN or one of its gated variants, GRU or LSTM. Because the computation of each representation uses both forward historical information and backward future information, it captures each word's information in its sentence context well.
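A minimal PyTorch sketch of this bidirectional encoding (a GRU variant with illustrative dimensions, not the patent's reference implementation):

```python
import torch
import torch.nn as nn

class BiRNNEncoder(nn.Module):
    """Compute h_i = [f_i : b_i] for every source word."""
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)  # word-vector lookup
        self.rnn = nn.GRU(emb_dim, hid_dim,
                          bidirectional=True, batch_first=True)

    def forward(self, x):          # x: (batch, T) word indices
        v = self.embed(x)          # v_i for each word
        h, _ = self.rnn(v)         # (batch, T, 2*hid_dim), i.e. [f_i : b_i]
        return h

enc = BiRNNEncoder(vocab_size=30000, emb_dim=256, hid_dim=256)
h = enc(torch.randint(0, 30000, (1, 9)))   # representations for 9 words
```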
Attention module flow:

This module computes the source-sentence information representation $c_i$ on which the i-th decoding step depends. Let $s_{i-1}$ be the decoder RNN's hidden state at the previous step; then $c_i$ is computed as:

$$c_i = \sum_{j=1}^{T} \alpha_{ij} h_j, \qquad \alpha_{ij} = \frac{\exp(a(s_{i-1}, h_j))}{\sum_{k=1}^{T} \exp(a(s_{i-1}, h_k))}$$

where $a(s_{i-1}, h_j)$ is a general scoring function of the variables $s_{i-1}$ and $h_j$ that can be realized in many ways; a simple, classical form is

$$a(s_{i-1}, h_j) = v^\top \tanh(W s_{i-1} + U h_j)$$

Thus the source-sentence semantic representation generated at the i-th decoding step is a weighted average of the source words, with the weights determining how much attention each source word receives at the current step.
Decoder module flow:

This module generates the target-language sentence with a recurrent neural network, based on the dynamically generated source-sentence vector representation $c_i$ at each step and the decoder state $s_{i-1}$ of the previous step. The specific computation is:

$$s_i = f(s_{i-1}, y_{i-1}, c_i)$$

where $f(\cdot)$ is the transformation function of the RNN, which may be a plain structure or a GRU or LSTM structure with a gating mechanism. The probability that $y_i$ is the k-th word $V_k$ of the target-language vocabulary is

$$P(y_i = V_k) = \mathrm{softmax}(b_k(s_i))$$

where $b_k(\cdot)$ is the transformation function associated with the k-th target word. After the word probabilities over the target vocabulary have been computed at each decoding step, the Beam Search algorithm yields the final decoded sequence $y = (y_1, y_2, y_3, \dots, y_n)$ that maximizes the output probability $P(y \mid x)$ of the whole sentence.
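A toy single-sentence sketch of the attention computation above, with the classical score $a(s, h) = v^\top \tanh(W s + U h)$ (tensor names and shapes are illustrative assumptions):

```python
import torch

def attention_context(s_prev, H, W, U, v):
    """c_i = sum_j alpha_ij * h_j over T encoder states."""
    # s_prev: (hid,); H: (T, 2*hid); W: (hid, hid); U: (hid, 2*hid); v: (hid,)
    scores = torch.tanh(s_prev @ W.T + H @ U.T) @ v   # a(s_{i-1}, h_j), (T,)
    alpha = torch.softmax(scores, dim=0)              # attention weights
    return alpha @ H                                  # weighted average c_i

hid, T = 4, 9
c = attention_context(torch.randn(hid), torch.randn(T, 2 * hid),
                      torch.randn(hid, hid), torch.randn(hid, 2 * hid),
                      torch.randn(hid))               # c: (2*hid,)
```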
The low-frequency word translation method based on semantic fusion is explained as follows:

Assume a bilingual sentence pair is input into the end-to-end neural machine translation system described above, with source-language sentence $x = (x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, \dots, x_m)$ and corresponding target-language sentence $y = (y_1, y_2, y_3, y_4, y_5, y_6, y_7, y_8, y_9, \dots, y_n)$. Word-frequency statistics identify the subword sequences $S_{unk_1} = (x_2, x_3)$ and $S_{unk_2} = (x_6, x_7, x_8)$ in the source sentence as low-frequency words to be processed, and dictionary lookup or model translation yields the target translations $T_{unk_1} = (y_2, y_3, y_4)$ and $T_{unk_2} = (y_7, y_8, y_9)$ corresponding to these two low-frequency words in the target sentence. After the wildcards UNKi replace the low-frequency words in the bilingual pair (x, y), a new bilingual pair $(\hat{x}, \hat{y})$ is obtained, where

$$\hat{x} = (x_1, u_1, x_4, x_5, u_2, x_9, \dots, x_m), \qquad \hat{y} = (y_1, u_1, y_5, y_6, u_2, y_{10}, \dots, y_n)$$
At this point, the purpose of low-frequency word semantic fusion is, during training of the neural network translation model, to fuse the vector representations of the source-language low-frequency words $S_{unk_1}(x_2, x_3)$ and $S_{unk_2}(x_6, x_7, x_8)$ and/or the target-language low-frequency words $T_{unk_1}(y_2, y_3, y_4)$ and $T_{unk_2}(y_7, y_8, y_9)$ with the vector representations of the wildcards $u_1$ and $u_2$. According to whether the low-frequency word vector to be fused comes from the source language, the target language, or both, the invention divides the fusion method into three schemes: Source-end low-frequency word vectors (Source), Target-end low-frequency word vectors (Target) and both-end low-frequency word vectors (Source + Target). Of the three, the Target and Source + Target schemes, which involve target low-frequency words, require the low-frequency word translations in advance; the low-frequency word translation method is therefore explained first, followed by the three fusion methods.
low frequency word translation
The invention mainly adopts a strategy of combining two schemes of searching word lists and model translation for the translation of low-frequency words. For bilingual low-frequency words, certain difficulty exists in modeling due to the fact that the internal statistical rule of the bilingual low-frequency words is relatively fuzzy and the occurrence frequency of the bilingual low-frequency words in training corpus is very low. Therefore, the method for preferentially using the dictionary lookup for the translation of the low-frequency words is adopted, and if the lookup fails, the customized low-frequency word translation model is adopted for translation.
The dictionary lookup method has the advantage that a low-frequency word (entity, term, proper noun and the like) vocabulary can be constructed in advance according to the specific application scene and field of machine translation. As long as the low-frequency words to be searched hit the word list, the returned translation is ensured to be completely correct and to be in line with a specific situation. The method has the defect that the hit rate is difficult to ensure in the searching process depending on the scale of word list construction.
The customized model translation is a supplement to the dictionary lookup method, and has the advantage that the target translation with the highest probability is output for the given low-frequency words to be translated. The invention adopts the character model to solve the problems of short low-frequency words and low frequency. Taking Nanjing Yangtze River Bridge and its translation "Nanjing Yangtze River Bridge" as an example, the bilingual forms under the character model are Nanjing Yangtze River Bridge and [ N a N j i N g ] [ Y a N g t z e ] [ R i v e R ] [ B R i d g e ]. The frequency of a modeling unit in the training data is improved by utilizing the word model, so that the translation performance of the low-frequency words is greatly improved.
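A small sketch of the character-level preprocessing assumed by this model, turning both sides of the example into character modeling units:

```python
def to_char_units(zh, en):
    """Split a bilingual pair into character-level modeling units."""
    src = list(zh)                              # one unit per Chinese character
    tgt = [" ".join(word) for word in en.split()]
    return src, tgt

src, tgt = to_char_units("南京长江大桥", "Nanjing Yangtze River Bridge")
# src: ['南', '京', '长', '江', '大', '桥']
# tgt: ['N a n j i n g', 'Y a n g t z e', 'R i v e r', 'B r i d g e']
```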
Fusing source-end low-frequency word vectors

As shown in FIG. 2, the main idea of this method is that the encoder fuses the vector representations of the source-language low-frequency words with the vector representations of the wildcards during training and decoding of the neural network translation model.

The low-frequency word translation method fusing source-end vectors is embodied mainly in the encoding process of neural machine translation and comprises three steps.

Step 1: take out the replaced source-language sentence $\hat{x}$ and the vector representations $\hat{v}$ of all its words. It should be noted that the word vectors may be obtained by model pre-training or randomly initialized from some distribution; the difference in final effect between the two is not obvious, because the word vectors are updated continuously during training of the neural network translation model.

Step 2: take out the vector representations $S_{unk_1}(v_2, v_3)$ and $S_{unk_2}(v_6, v_7, v_8)$ of the low-frequency words $S_{unk_1}(x_2, x_3)$ and $S_{unk_2}(x_6, x_7, x_8)$ in the sentence, and encode each with an LSTM to obtain the vector representations $s_{unk_1}$ and $s_{unk_2}$.

Step 3: apply a weighted sum and a nonlinear transformation to the wildcard vector representation $u_{unk_1}$ of the low-frequency word $S_{unk_1}$ and its subword-sequence vector representation $s_{unk_1}$ to obtain the final vector representation $V_1$ of $S_{unk_1}$:

$$V_1 = \tanh(W_{u,unk_1} \cdot u_{unk_1} + W_{s,unk_1} \cdot s_{unk_1})$$

In the same way, the fused vector representation $V_2$ of the low-frequency word $S_{unk_2}$ is obtained:

$$V_2 = \tanh(W_{u,unk_2} \cdot u_{unk_2} + W_{s,unk_2} \cdot s_{unk_2})$$

After these three steps, the low-frequency word translation method fusing source-end vectors is realized.
Because the final vector representations $V_1$ and $V_2$ of the low-frequency words $S_{unk_1}$ and $S_{unk_2}$ contain both the semantic information of the historical and future context carried by the wildcards and the semantic information of the low-frequency words themselves, the representations subsequently generated by the bidirectional RNN encoding contain more accurate and complete information. This in turn helps the Attention and Decoder modules generate more accurate low-frequency word translations and target translations.
Fusing target-end low-frequency word vectors

As shown in FIG. 3, and similarly to the previous method, the main idea of fusing target-end low-frequency word vectors is that the encoder fuses the vector representations of the target-language low-frequency words with the vector representations of the wildcards during training and decoding of the neural network translation model.

The low-frequency word translation method fusing target-end vectors completes three steps in the encoding process of neural machine translation.

Step 1: take out the replaced source-language sentence $\hat{x}$ and the vector representations $\hat{v}$ of all its words.

Step 2: take out the vector representations $T_{unk_1}(v_2, v_3, v_4)$ and $T_{unk_2}(v_7, v_8, v_9)$ of the low-frequency words $T_{unk_1}(y_2, y_3, y_4)$ and $T_{unk_2}(y_7, y_8, y_9)$ in the sentence, and encode each with an LSTM to obtain the vector representations $t_{unk_1}$ and $t_{unk_2}$.

Step 3: apply a weighted sum and a nonlinear transformation to the wildcard vector representation $u_{unk_1}$ of the low-frequency word $T_{unk_1}$ and its subword-sequence vector representation $t_{unk_1}$ to obtain the final vector representation $V_1$ of $T_{unk_1}$:

$$V_1 = \tanh(W_{u,unk_1} \cdot u_{unk_1} + W_{t,unk_1} \cdot t_{unk_1})$$

In the same way, the fused vector representation $V_2$ of the low-frequency word $T_{unk_2}$ is obtained:

$$V_2 = \tanh(W_{u,unk_2} \cdot u_{unk_2} + W_{t,unk_2} \cdot t_{unk_2})$$

After these three steps, the low-frequency word translation method fusing target-end vectors is realized.
Fusing both-end low-frequency word vectors

As shown in FIG. 5, a schematic diagram of the semantic fusion of the vector representations of source-language and target-language low-frequency words with the vector representation of the wildcard UNKi, the method of fusing the low-frequency word vectors at both ends integrates the two previous methods and is a key point of the invention. The low-frequency word translation method fusing both-end vectors completes three steps in the encoding process of neural machine translation.

Step 1: take out the replaced source-language sentence $\hat{x}$ and the vector representations $\hat{v}$ of all its words.

Step 2: take out the vector representations $S_{unk_1}(v_2, v_3)$ and $S_{unk_2}(v_6, v_7, v_8)$ of the source-language low-frequency words $S_{unk_1}(x_2, x_3)$ and $S_{unk_2}(x_6, x_7, x_8)$ in the sentence, and encode each with an LSTM to obtain the vector representations $s_{unk_1}$ and $s_{unk_2}$. Then take out the vector representations $T_{unk_1}(v_2, v_3, v_4)$ and $T_{unk_2}(v_7, v_8, v_9)$ of the target-language low-frequency words $T_{unk_1}(y_2, y_3, y_4)$ and $T_{unk_2}(y_7, y_8, y_9)$, and encode each with an LSTM to obtain the vector representations $t_{unk_1}$ and $t_{unk_2}$.

Step 3: apply a weighted sum and a nonlinear transformation to the wildcard vector representation $u_{unk_1}$ of the low-frequency word $T_{unk_1}$, the source-language subword-sequence vector representation $s_{unk_1}$ and the target-language subword-sequence vector representation $t_{unk_1}$ to obtain the final vector representation $V_1$:

$$V_1 = \tanh(W_{u,unk_1} \cdot u_{unk_1} + W_{s,unk_1} \cdot s_{unk_1} + W_{t,unk_1} \cdot t_{unk_1})$$

In the same way, the fused vector representation $V_2$ of the low-frequency word $T_{unk_2}$ is obtained:

$$V_2 = \tanh(W_{u,unk_2} \cdot u_{unk_2} + W_{s,unk_2} \cdot s_{unk_2} + W_{t,unk_2} \cdot t_{unk_2})$$

After these three steps, the low-frequency word translation method fusing both-end vectors is realized.
In the specific implementation of fusing the low-frequency word vectors at both ends, the invention contains innovations and advantages in three main aspects:

First, regarding the sources of the low-frequency word vectors to be fused, the vector representations of the low-frequency word's semantic information in both the source-language and target-language vector spaces are used simultaneously.

Second, regarding the encoding of a low-frequency word's multiple subword vectors, a long short-term memory (LSTM) network scans the subword sequence in reverse, so that the first subword contributes the most important information to the final vector representation.

Third, regarding the semantic fusion of the low-frequency word, the hyperbolic tangent tanh is adopted as the activation function to fuse the three vectors, which amounts to a nonlinear transformation after a weighted summation of them.
The innovations and advantages of these three aspects are set forth in detail below. First, the vector representations of the low-frequency word's semantic information in the two vector spaces of the source and target languages are used simultaneously. Because a word may be low-frequency in one language while its counterpart at the other end of the bilingual pair is not necessarily low-frequency, this approach lets the model extract the low-frequency word's semantic features from two different languages in a complementary fashion during training, effectively mitigating the low-frequency deficiency. Furthermore, because the vector representations of the target low-frequency words are supplied during source-sentence encoding, a copying mechanism from source sentence to target sentence can be learned during model training. Specifically, the source end fuses in the vector representations $T_{unk_1}(v_2, v_3, v_4)$ and $T_{unk_2}(v_7, v_8, v_9)$ of the target low-frequency words $T_{unk_1}(y_2, y_3, y_4)$ and $T_{unk_2}(y_7, y_8, y_9)$ during encoding, and the output target translation also contains $T_{unk_1}(y_2, y_3, y_4)$ and $T_{unk_2}(y_7, y_8, y_9)$. This realizes a direct prompt: through training, the neural network translation model learns a mechanism that outputs the source-prompted target low-frequency words, together with the other context, directly into the predicted translation.
Second, a long short-term memory (LSTM) network is used to scan the subword sequence in reverse, so that the first subword contributes the most important information to the final vector representation. LSTM networks are good at modeling natural language in machine translation, converting sentences of arbitrary length into floating-point vectors of a fixed dimension, remembering the important words of a sentence and retaining that memory over long spans. The LSTM is a special structural variant of the RNN model: it adds three control units, namely an input gate, an output gate and a forget gate. For information entering the LSTM, the three gates judge what proportion should be remembered, forgotten and output, which effectively alleviates the long-distance dependency problem in neural networks.
[Figure: LSTM encoding of a low-frequency word's subword sequence]
The invention uses an LSTM network to scan the low-frequency word's subword sequence in reverse, which ensures that the first subword plays the greatest role in the vector representation and thus that the connection between the low-frequency word and its context is smoother. As shown in the figure above, the LSTM encoding of a low-frequency word's subword sequence is illustrated with the low-frequency word $S_{unk_2}(x_6, x_7, x_8)$. As in sentence encoding, the vector representations $v_6, v_7, v_8$ of all the subwords are first taken out and then input one by one, from right to left, into the LSTM network for encoding:

$$s_{unk_2} = \mathrm{LSTM}(v_8, v_7, v_6)$$

Finally, the output vector $s_{unk_2}$ of the LSTM network is the vector representation of the low-frequency word $S_{unk_2}(x_6, x_7, x_8)$, and $x_6$, being the last input, plays the greatest role in it.
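A PyTorch sketch of this reverse-order subword encoding (dimensions and names are illustrative assumptions):

```python
import torch
import torch.nn as nn

def encode_subwords_reversed(subword_vecs, lstm):
    """Feed subword vectors right to left, so the first subword enters last
    and dominates the final state: s_unk2 = LSTM(v8, v7, v6)."""
    rev = torch.flip(subword_vecs, dims=[0])   # (L, dim) in reverse order
    _, (h_n, _) = lstm(rev.unsqueeze(0))       # batch of one sequence
    return h_n[-1, 0]                          # final hidden state, (dim,)

lstm = nn.LSTM(input_size=256, hidden_size=256, batch_first=True)
v = torch.randn(3, 256)                        # v6, v7, v8
s_unk2 = encode_subwords_reversed(v, lstm)     # v6 is read last
```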
Third, the hyperbolic tangent tanh is adopted as the activation function to fuse the three vectors, which amounts to a nonlinear transformation after their weighted summation. Taking the low-frequency word $S_{unk_1}(x_2, x_3)$ as an example, the semantic fusion is computed as:

$$V_1 = \tanh(W_{u,unk_1} \cdot u_{unk_1} + W_{s,unk_1} \cdot s_{unk_1} + W_{t,unk_1} \cdot t_{unk_1})$$

The weighted-sum operation lets the model automatically learn, for each group of samples and according to the specific contextual semantic environment, the selection weights $W_{u,unk_1}$, $W_{s,unk_1}$ and $W_{t,unk_1}$ of the three vectors; that is, these weight matrices belong to the adaptive parameters of model training. The larger a weight, the more influence the corresponding vector has on the encoder output. The nonlinear transformation is realized mainly by the activation function, which is essential for an artificial neural network model to learn and express complex, nonlinear mappings. The invention adopts the hyperbolic tangent tanh as the activation function of the semantic fusion module in the neural network translation model; it is computed as:

$$\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$$

As shown in FIG. 4, the hyperbolic tangent tanh adopted by the invention has two advantages.

First, when the input x is very large or very small, the slope of tanh approaches zero. That is, only when the weighted sum of the three vectors falls within a certain range does the tanh nonlinearity produce a large gradient; once the weighted sum exceeds that range, tanh approaches saturation and barely responds. The tanh function thus screens out abnormal values of the weighted sum, preventing an overly large or small result from biasing the encoder output and the updating of the context vector.

Second, as its graph shows, tanh is smooth, easy to differentiate, and produces output centered at 0 and bounded in (-1, 1). These properties make parameter updates with tanh efficient and rescale the weighted sum of the three vectors into a fixed range after the nonlinear transformation, which is especially important in matrix-operation-dense settings such as neural networks.
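A quick numeric check of the saturation behavior described above (plain Python; the printed values follow directly from the tanh formula):

```python
import math

def tanh(x):
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

# The gradient 1 - tanh(x)^2 vanishes as the weighted sum grows in magnitude.
for x in [0.0, 0.5, 2.0, 10.0]:
    print(f"x={x:5.1f}  tanh={tanh(x):+.4f}  grad={1 - tanh(x) ** 2:.6f}")
# x=  0.0  tanh=+0.0000  grad=1.000000
# x=  0.5  tanh=+0.4621  grad=0.786448
# x=  2.0  tanh=+0.9640  grad=0.070651
# x= 10.0  tanh=+1.0000  grad=0.000000
```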
In neural machine translation, the intermediate vector produced by the Encoder module's encoding of the source sentence directly determines the quality of the target translation generated by the Attention and Decoder modules. The semantic fusion method can therefore incorporate the low-frequency word's semantic information while preserving the semantic information of the original wildcard; the final vector representation of the low-frequency word then carries both the contextual information of the sentence and a full account of the low-frequency word's own semantics.
The invention provides a low-frequency word processing method based on semantic fusion. The method fuses the vector representations of low-frequency words and wildcards in the Encoder module of end-to-end neural machine translation, so that more accurate and complete information is encoded into the intermediate vector, helping the Attention and Decoder modules generate higher-quality translations. Besides solving low-frequency word translation, the method also helps improve the translation of common named entities, complex noun phrases and technical terms in natural-language sentences.
The above description is only a preferred embodiment of the invention and is not intended to limit it; all simple modifications, changes and equivalent structural variations made to the above embodiment in accordance with the technical essence of the invention fall within the protection scope of its technical solution.

Claims (4)

1. A low-frequency word translation method based on semantic information fusion is characterized by comprising the following steps: inputting a bilingual sentence pair in a translation system, wherein a source language sentence x and a target language sentence y corresponding to the source language sentence obtain a sub-word sequence of low-frequency words in the source language sentence, obtain a target translation of the low-frequency words in the target language sentence, and obtain a new bilingual sentence pair after replacing the low-frequency words in the bilingual sentence pair (x, y) by a wildcard character UNKi
Figure FDA0002813577090000011
Fusing vector representations of source language low-frequency words and/or target language low-frequency words with vector representations of wildcard characters UNKi;
the translation method for fusing the vector representation of the source language low-frequency word and the vector representation of the wildcard UNKi comprises the following steps:
step one, taking out the replaced source language sentence
Figure FDA0002813577090000012
Vector characterization of all words in
Figure FDA0002813577090000013
Step two, taking out the vector representation of the low-frequency words in the source language sentence, and coding the vector representation by adopting the LSTM to obtain the vector representation
Figure FDA0002813577090000014
Step three, weighted summation and nonlinear transformation are respectively carried out on the wildcard vector representation of the low-frequency words and the source language sub-word sequence vector representation, and the final vector representation V of the low-frequency words is obtainedi
Figure FDA0002813577090000015
Wherein tanh is a hyperbolic tangent function,
Figure FDA0002813577090000016
is a vector characterization of wildcards, Wu,unkiIs LSTM encoder based on specific context semantic environment pairs
Figure FDA0002813577090000017
Weight of (1), Ws,unkiIs LSTM encoder based on specific context semantic environment pairs
Figure FDA0002813577090000018
The weight of (c);
the translation method for fusing the vector representation of the low-frequency word of the target language and the vector representation of the wildcard UNKi comprises the following steps:
step one, taking out the replaced source language sentence
Figure FDA0002813577090000019
Vector characterization of all words in
Figure FDA00028135770900000110
And step two, taking out the vector representation of the low-frequency words in the target language sentence, and coding the vector representation by adopting LSTM to obtain the vector representationv unkj
Step three, weighted summation and nonlinear transformation are respectively carried out on the wildcard vector representation of the low-frequency words and the target language sub-word sequence vector representation, and the final vector representation V of the low-frequency words is obtainedj
Figure FDA00028135770900000111
Wherein tanh is a hyperbolic tangent function,
Figure FDA00028135770900000112
is a vector characterization of wildcards, Wu,unkjIs LSTM encoder based on specific context semantic environment pairs
Figure FDA00028135770900000113
Weight of (1), Wt,unkjIs LSTM encoder based on specific context semantic environment pairsv unkjThe weight of (c);
the translation method for fusing the vector representation of the source language low-frequency words and the target language low-frequency words with the vector representation of the wildcard UNKi comprises the following steps:
step one, taking out the replaced source language sentence
Figure FDA0002813577090000021
Vector characterization of all words in
Figure FDA0002813577090000022
Step two, vector representations of low-frequency words in source language sentences and target language sentences are taken out and are respectively encoded by adopting LSTM to obtain vector representations
Figure FDA0002813577090000023
Andv unkm
step three, weighted summation and nonlinear transformation are respectively carried out on the wildcard vector representation of the low-frequency words, the source language sub-word sequence vector representation and the target language sub-word sequence vector representation to obtain the final vector representation V of the low-frequency wordsm
Figure FDA0002813577090000024
Wherein tanh is a hyperbolic tangent function,
Figure FDA0002813577090000025
is a vector characterization of wildcards, Wu,unkmIs LSTM encoder based on specific context semantic environment pairs
Figure FDA0002813577090000026
Weight of (1), Ws,unkmIs LSTM encoder based on specific context semantic environment pairs
Figure FDA0002813577090000027
Weight of (1), Wt,unkmIs LSTM encoder based on specific context semantic environment pairsv unkmThe weight of (c);
The vector representations of all words in the replaced source-language sentence $S'$ are initialized randomly according to a certain distribution.
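As a hedged sketch of the combined source-and-target fusion, and of the random initialization just mentioned (for which nn.Embedding's default random initializer stands in), the earlier example might be extended as follows; again, every identifier here is illustrative, not taken from the patent:

```python
import torch
import torch.nn as nn

class BilingualSubwordFusion(nn.Module):
    """Sketch: fuse the wildcard vector with separately LSTM-encoded
    source and target subword sequences (the V_m formula above)."""
    def __init__(self, src_vocab, tgt_vocab, dim):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)   # randomly initialized
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)   # randomly initialized
        self.unk_emb = nn.Parameter(torch.randn(dim)) # u_{unk_m}
        self.src_lstm = nn.LSTM(dim, dim, batch_first=True)
        self.tgt_lstm = nn.LSTM(dim, dim, batch_first=True)
        self.W_u = nn.Linear(dim, dim, bias=False)    # W_{u,unk_m}
        self.W_s = nn.Linear(dim, dim, bias=False)    # W_{s,unk_m}
        self.W_t = nn.Linear(dim, dim, bias=False)    # W_{t,unk_m}

    @staticmethod
    def encode(lstm, emb, ids):
        # Final LSTM hidden state as the subword-sequence representation
        _, (h_n, _) = lstm(emb(ids))
        return h_n[-1].squeeze(0)

    def forward(self, src_subword_ids, tgt_subword_ids):
        v_s = self.encode(self.src_lstm, self.src_emb, src_subword_ids)
        v_t = self.encode(self.tgt_lstm, self.tgt_emb, tgt_subword_ids)
        # V_m = tanh(W_u·u_{unk_m} + W_s·v^s_{unk_m} + W_t·v^t_{unk_m})
        return torch.tanh(self.W_u(self.unk_emb)
                          + self.W_s(v_s) + self.W_t(v_t))

fusion = BilingualSubwordFusion(src_vocab=8000, tgt_vocab=8000, dim=256)
V_m = fusion(torch.tensor([[17, 342, 99]]), torch.tensor([[4, 77]]))
print(V_m.shape)  # torch.Size([256])
```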
2. The method for translating low-frequency words according to claim 1, wherein the subword sequence of the low-frequency word to be processed in the source-language sentence is obtained through word-frequency statistics.
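A minimal sketch of such word-frequency statistics, assuming whitespace-tokenized sentences and an arbitrary frequency threshold of 2 (both assumptions, not values from the patent):

```python
from collections import Counter

def low_frequency_words(sentences, threshold=2):
    """Count word frequencies across a corpus and return the words whose
    count falls below the threshold -- the candidates to be replaced by
    a wildcard and re-encoded from their subword sequences."""
    counts = Counter(word for sent in sentences for word in sent.split())
    return {word for word, count in counts.items() if count < threshold}

corpus = ["the cat sat", "the dog sat", "an axolotl appeared"]
print(low_frequency_words(corpus))
# {'cat', 'dog', 'an', 'axolotl', 'appeared'} -- each occurs only once
```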
3. The method for translating low-frequency words according to claim 1, wherein the target translation corresponding to the low-frequency word in the target-language sentence is obtained through dictionary lookup or model translation.
4. The method for translating low-frequency words according to claim 3, wherein the low-frequency word is first translated by dictionary lookup; if the lookup fails, it is translated by a word model.
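The dictionary-first fallback of claim 4 amounts to the following control flow; this is a sketch under the assumption that the translation model is exposed as a plain callable (model_translate is a hypothetical stand-in):

```python
def translate_low_frequency_word(word, dictionary, model_translate):
    """Try dictionary lookup first; only if the lookup fails, fall back
    to translating the word with the model."""
    target = dictionary.get(word)
    return target if target is not None else model_translate(word)

# Hypothetical usage
lexicon = {"axolotl": "美西螈"}
model_translate = lambda w: f"<model:{w}>"
print(translate_low_frequency_word("axolotl", lexicon, model_translate))  # 美西螈
print(translate_low_frequency_word("quokka", lexicon, model_translate))   # <model:quokka>
```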
CN202010060672.7A 2020-01-19 2020-01-19 Semantic information fusion-based low-frequency word translation method Active CN111274826B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010060672.7A CN111274826B (en) 2020-01-19 2020-01-19 Semantic information fusion-based low-frequency word translation method

Publications (2)

Publication Number Publication Date
CN111274826A (en) 2020-06-12
CN111274826B (en) 2021-02-05

Family

ID=71000709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010060672.7A Active CN111274826B (en) 2020-01-19 2020-01-19 Semantic information fusion-based low-frequency word translation method

Country Status (1)

Country Link
CN (1) CN111274826B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177414A (en) * 2021-04-27 2021-07-27 桂林电子科技大学 Semantic feature processing method and device and storage medium
CN113378567B (en) * 2021-07-05 2022-05-10 广东工业大学 Chinese short text classification method for improving low-frequency words

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982329A (en) * 2012-11-02 2013-03-20 华南理工大学 Segmentation recognition and semantic analysis integration translation method for mobile devices
US8990066B2 (en) * 2012-01-31 2015-03-24 Microsoft Corporation Resolving out-of-vocabulary words during machine translation
CN107329960A * 2017-06-29 2017-11-07 哈尔滨工业大学 Context-sensitive out-of-vocabulary word translation apparatus and method for neural machine translation
CN108345590A * 2017-12-28 2018-07-31 北京搜狗科技发展有限公司 Translation method, apparatus, electronic device and storage medium
CN110457715A * 2019-07-15 2019-11-15 昆明理工大学 Out-of-vocabulary word processing method for Chinese-Vietnamese neural machine translation incorporating a classification dictionary
CN111428518A (en) * 2019-01-09 2020-07-17 科大讯飞股份有限公司 Low-frequency word translation method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109359304B (en) * 2018-08-22 2023-04-18 新译信息科技(深圳)有限公司 Restrictive neural network machine translation method and storage medium
CN109902314B (en) * 2019-04-18 2023-11-24 中译语通科技股份有限公司 Term translation method and device
CN110334362B (en) * 2019-07-12 2023-04-07 北京百奥知信息科技有限公司 Method for solving and generating untranslated words based on medical neural machine translation

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Chinese-English OOV Term Translation with Web Mining, Multiple Feature Fusion and Supervised Learning; Yun Zhao et al.; Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data; 2014-12-31; 234-246 *
Using Sublexical Translations to Handle the OOV Problem in Machine Translation; Chungchi Huang et al.; ACM Transactions on Asian Language Information Processing; 2011-09-30; 1-16 *
Research on Automatic Construction Techniques for Chinese-Japanese Bilingual Parallel Corpora; Yin Cunyan; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2016-04-15 (No. 4); I138-91 *
Research on Out-of-Vocabulary Word Processing for Neural Machine Translation Fusing Semantic Concepts; Li Shaotong; China Master's Theses Full-text Database, Information Science and Technology; 2018-06-15 (No. 6); I138-2159 *
Research on Out-of-Vocabulary Word Translation for Neural Network Translation Models of Chinese-English Patent Documents; Zheng Xiaokang; China Master's Theses Full-text Database, Information Science and Technology; 2018-01-15 (No. 1); I138-2157 *

Also Published As

Publication number Publication date
CN111274826A (en) 2020-06-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant