CN112541364A - Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge - Google Patents

Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge

Info

Publication number
CN112541364A
CN112541364A
Authority
CN
China
Prior art keywords
phrase
word
tree
chinese
machine translation
Prior art date
Legal status
Pending
Application number
CN202011409192.3A
Other languages
Chinese (zh)
Inventor
余正涛
邹翔
赖华
徐毓
文永华
朱俊国
Current Assignee
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Kunming University of Science and Technology filed Critical Kunming University of Science and Technology
2020-12-03 Priority to CN202011409192.3A
2021-03-23 Publication of CN112541364A
Status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge, which separately fuses and analyzes linguistic feature knowledge at three different levels: characters, words and phrases. First, character-based word representations are obtained with a bidirectional LSTM and combined with pre-trained word vectors, letting the model dynamically select between word-vector and character information. Second, by constructing a phrase-tree encoder on top of a standard sequence encoder, phrase information within a sentence is further merged into the sequence transformation process of Chinese-Vietnamese neural machine translation. Experimental results show that this fusion method can effectively exploit language feature knowledge at different levels to compensate for the scarcity of Chinese-Vietnamese resources, and improves the performance of the Chinese-Vietnamese translation model to a certain extent.

Description

Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge
Technical Field
The invention relates to a Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge, and belongs to the technical field of natural language processing.
Background
Chinese-Vietnamese is a typical low-resource language pair with few available resources, so the problem of insufficient resources must be addressed by exploiting language feature knowledge at different levels. Vietnamese has rich morphological variation and diverse grammatical structures; the invention aims to fully mine and exploit language feature knowledge at different levels to address the resource scarcity faced by Chinese-Vietnamese neural machine translation.
Language feature knowledge at different levels refers to the semantic information contained in symbol sequences at different levels, such as characters, words and phrases. Most existing neural machine translation operates on words, but training word vectors requires large-scale corpora, and out-of-vocabulary words easily arise during translation. Researchers have therefore considered fully exploiting the information inside words, starting from smaller granularities. Considering the rich morphological variation and diverse grammatical structures of Vietnamese, the invention uses three levels (characters, words and phrases) as a multi-level representation of the linguistic symbol sequence. The character sequence effectively captures the morphological variation of Vietnamese: any Vietnamese word is a combination of a character sequence, and the character sequence effectively represents the information contained in the word, which alleviates to some extent the rare-word problem that easily arises with small-scale corpora. The word sequence intuitively depicts the semantic information of the source language, matches the habitual human mode of expression, and was the earliest and most immediately effective translation unit used in machine translation. The phrase sequence contains word-order and syntactic-structure information, which helps with the long-distance dependency problem in Chinese-Vietnamese neural machine translation. The invention therefore proposes a Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge.
Disclosure of Invention
The invention provides a Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge, which effectively exploits language feature knowledge at different levels to compensate for the scarcity of Chinese-Vietnamese resources and improves the performance of the Chinese-Vietnamese translation model to a certain extent.
The invention provides a Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge (characters, words and phrases). The method separately fuses and analyzes language feature knowledge at three different levels: characters (Character), words (Word) and phrases (Phrase). To exploit this knowledge effectively, the invention first obtains character-based word vector representations through a bidirectional LSTM, then combines these character-based representations with pre-trained word vectors, letting the model dynamically select between the word vectors and the character information through an attention mechanism. Second, by constructing a phrase-tree encoder on top of a standard sequence encoder, phrase information within a sentence is further merged into the sequence transformation process of Chinese-Vietnamese neural machine translation. Experimental results show that this fusion method, the preferred technical scheme obtained during experimentation, can effectively exploit language feature knowledge at different levels to compensate for the scarcity of Chinese-Vietnamese resources and improves the performance of the Chinese-Vietnamese translation model to a certain extent.
The technical scheme of the invention is as follows: a Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge, comprising the following specific steps:
step1, corpus collection and preprocessing: collecting Chinese and Vietnamese parallel data, and preprocessing each side with preprocessing tools suited to the characteristics of Chinese and Vietnamese respectively;
step2, on the basis of Step1, obtaining vectors of the characters within a word using a bidirectional LSTM, and combining the word vector computed from character training with the pre-trained word vector to obtain a word vector fusing character features;
step3, deep semantic feature fusion: in head-driven phrase structure grammar, a sentence is composed of several phrase units and is represented as a binary tree; according to this sentence structure, a phrase-tree-based encoder is constructed on top of the standard sequence encoder, further merging phrase feature knowledge into the word-based standard sequence encoder;
this completes the realization of Chinese-Vietnamese neural machine translation fusing language feature knowledge at different levels.
Further, the Step1 includes the specific steps of:
step1.1, obtaining 140K Chinese-Vietnamese parallel sentence pairs through web crawling and manual collection, of which 2K parallel sentence pairs are used as the test set and 2K parallel sentence pairs as the validation set;
step1.2, the Chinese data are segmented with the Jieba word segmenter, and phrase syntax parsing is performed with Stanford University's StanfordNLP toolkit; the Vietnamese data are parsed with a Vietnamese phrase syntax parsing tool to obtain Vietnamese phrase trees.
Further, the Step2 includes the specific steps of:
step2.1, in neural machine translation, natural language must be characterized as feature vectors to serve as model input, and the semantic vector representation of a word is computed from the information of the characters within the word;
step2.2, combining the word vector computed from character training with the pre-trained word vector by a weighted combination to obtain the optimal representation of a semantic unit;
step2.3, common words already have high-quality word vector representations, so the character representation is aligned with the word vector by optimizing the vector, and the word vector fusing character features is finally obtained through training.
Further, the Step3 includes the specific steps of:
step3.1, in head-driven phrase structure grammar, a sentence is composed of several phrases and is represented as a binary tree, where each node in the binary tree is represented by an LSTM unit and a sentence vector is constructed from phrase vectors in a bottom-up manner;
step3.2, when computing the LSTM units of the leaf nodes, the model is allowed to compute different representations for the same word when it occurs multiple times in a sentence; the model then has two different sentence vectors, one from the sequence encoder and the other from the phrase-tree-based encoder, so another Tree-LSTM unit is provided that takes the final sequence encoder unit and the phrase-tree-based encoder unit as its two child units and is used to initialize the decoder unit;
step3.3, an attention mechanism is introduced into the phrase tree-sequence model so that the model attends not only to the sequence hidden units but also to the phrase hidden units; when decoding a target word the model can learn which words or phrases in the source sentence are important, further merging phrase feature knowledge on top of the word-based encoder.
The invention has the beneficial effects that: through the fused representation of linguistic feature knowledge at three different levels, characters, words and phrases, the invention introduces the semantic information contained in different symbol sequences into the neural machine translation process. Experimental results show that the method effectively exploits language feature knowledge at different levels and improves the performance of Chinese-Vietnamese neural machine translation to a certain extent.
Drawings
FIG. 1 is a flow chart of neural machine translation incorporating knowledge of multi-level linguistic features in accordance with the present invention;
FIG. 2 is a schematic diagram of shallow semantic feature fusion in accordance with the present invention;
FIG. 3 is an exemplary diagram of a Vietnamese phrase structure tree according to the present invention;
FIG. 4 is a diagram of the phrase tree-sequence based neural machine translation of the present invention.
Detailed Description
Example 1: as shown in figs. 1-4, a Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge comprises the following specific steps:
step1, corpus collection and preprocessing: collecting Chinese and Vietnamese parallel data, and preprocessing each side with preprocessing tools suited to the characteristics of Chinese and Vietnamese respectively;
step1.1, obtaining 140K Chinese-Vietnamese parallel sentence pairs through web crawling and manual collection, of which 2K parallel sentence pairs are used as the test set and 2K parallel sentence pairs as the validation set;
step1.2, in the preprocessing of the experimental data, the Chinese data are segmented with the Jieba word segmentation tool, and phrase syntax parsing is performed with Stanford University's StanfordNLP toolkit. The Vietnamese data are parsed with a Vietnamese phrase syntax parsing tool to obtain Vietnamese phrase trees: because few open-source syntax parsing tools exist for Vietnamese, the Vietnamese phrase syntax parser developed by Li Ying et al. is used to parse the Vietnamese data into phrase trees. The experimental data used are shown in Table 1, and an illustrative preprocessing sketch is given after the table.
Table 1 Experimental data settings [the table is an image in the original publication and its contents are not reproduced here]
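By way of illustration only, the following minimal Python sketch shows this kind of preprocessing. It is not the patent's exact pipeline: it assumes the open-source jieba package for Chinese segmentation and an NLTK client talking to a locally running Stanford CoreNLP server loaded with the Chinese models; the Vietnamese phrase parser by Li Ying et al. is not publicly packaged and is omitted here.

    # Illustrative preprocessing sketch (assumed tooling, not the patent's exact pipeline).
    import jieba
    from nltk.parse.corenlp import CoreNLPParser

    def segment_chinese(sentence):
        """Segment a Chinese sentence into words with the Jieba segmenter."""
        return list(jieba.cut(sentence))

    def parse_chinese_phrases(tokens):
        """Constituency-parse pre-segmented tokens via a CoreNLP server
        (assumed to be running locally with the Chinese models loaded)."""
        parser = CoreNLPParser(url="http://localhost:9000")
        return next(parser.parse(tokens))  # an nltk.Tree phrase-structure tree

    tokens = segment_chinese("他喜欢游泳")   # e.g. ['他', '喜欢', '游泳']
    tree = parse_chinese_phrases(tokens)     # the patent then binarizes such trees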
Step2, on the basis of Step1, obtaining vectors of the characters within a word using a bidirectional LSTM, and combining the word vector computed from character training with the pre-trained word vector to obtain a word vector fusing character features;
step2.1, in neural machine translation, natural language must be characterized as feature vectors to serve as model input. We therefore first consider how to compute the semantic vector representation of a word from the information of the characters within it. As shown in fig. 2, the words in a sentence are decomposed into characters, giving a character embedding sequence (c_1, ..., c_R) that is passed through a bidirectional LSTM:

h_i^f = LSTM_f(c_i, h_{i-1}^f), h_i^b = LSTM_b(c_i, h_{i+1}^b) (1)

The last hidden vector of each LSTM direction is concatenated as the character representation of a single word and then passed through a separate non-linear layer:

h* = [h_R^f ; h_1^b], m = tanh(W_m h*) (2)

where W_m is a weight matrix mapping the concatenated hidden vectors of the two LSTMs to the combined word representation m constructed from single characters;
in the shallow semantic feature fusion of characters and words, word2vec pre-training word vectors are used, the word embedding dimension is 256-dimensional, and words which appear only once in training data are replaced by universal OOV marks and are still used in character components. All the numbers in the corpus are replaced by the character "0", the embedding length of the character is set to 50, and random initialization is performed. The LSTM layer size for each direction is set to 200, the combination indicates that m has the same dimension as word embedding, the default learning rate is 1.0, and the batch size is 64.
Step2.2, the word vector computed from character training and the pre-trained word vector are combined by a weighted combination to obtain the optimal representation of a semantic unit. Equations (3) and (4) let the model combine, for each word, its word embedding x and its character composition m:

z = σ(W_z^(3) tanh(W_z^(1) x + W_z^(2) m)) (3)

x~ = z · x + (1 - z) · m (4)

where W_z^(1), W_z^(2) and W_z^(3) are weight matrices for z, and σ(·) is the logistic function with values in the range [0, 1]. The vectors z, x and m have the same dimensions, which allows the model to dynamically decide how much information to use from the word embedding and from the character component;
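A corresponding sketch of the gate in equations (3)-(4), under the same PyTorch assumption; the three linear maps mirror W_z^(1), W_z^(2) and W_z^(3):

    # Gating of eqs. (3)-(4): z decides, per dimension, how much to take
    # from the pre-trained word embedding x and the character vector m.
    import torch
    import torch.nn as nn

    class CharWordGate(nn.Module):
        def __init__(self, dim=256):
            super().__init__()
            self.W_z1 = nn.Linear(dim, dim, bias=False)
            self.W_z2 = nn.Linear(dim, dim, bias=False)
            self.W_z3 = nn.Linear(dim, dim, bias=False)

        def forward(self, x, m):
            z = torch.sigmoid(self.W_z3(torch.tanh(self.W_z1(x) + self.W_z2(m))))
            return z * x + (1.0 - z) * m   # x~, eq. (4)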
step2.3, common words themselves already have high-quality word vector representations, so the character representation is aligned with the word vector by optimizing the vector m:

E~ = E + Σ_t g_t (1 - cos(m^(t), x_t)) (5)

g_t = 0 if word t is OOV, and 1 otherwise (6)

where m^(t) is the representation dynamically built from the individual characters of the t-th input word. The character component should not learn from the embeddings of rare words, so g_t is set to 0 for OOV words. The word vector x~ fusing character features is finally obtained through training.
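The auxiliary objective of equations (5)-(6) can be sketched as follows, again assuming PyTorch; the boolean OOV mask is our own convention:

    # Auxiliary loss of eqs. (5)-(6): pull the character composition m^(t)
    # toward the word embedding x_t for in-vocabulary words; OOV words get
    # g_t = 0 so their (unreliable) embeddings are not learned from.
    import torch
    import torch.nn.functional as F

    def char_alignment_loss(m, x, is_oov):
        """m, x: (T, dim); is_oov: (T,) bool. Returns sum_t g_t (1 - cos(m^(t), x_t))."""
        g = (~is_oov).float()                     # eq. (6)
        cos = F.cosine_similarity(m, x, dim=-1)   # cos(m^(t), x_t)
        return (g * (1.0 - cos)).sum()            # added to the main loss E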
Step3, deep semantic feature fusion: in head-driven phrase structure grammar, a sentence is composed of several phrase units and is represented as a binary tree; according to this sentence structure, a phrase-tree-based encoder is constructed on top of the standard sequence encoder, further merging phrase feature knowledge into the word-based standard sequence encoder. In head-driven phrase structure grammar, a sentence is composed of several phrase units and represented as a binary tree, as shown in FIG. 3.
Step3.1, based on this sentence structure, we construct a phrase-tree-based encoder on top of the standard sequence encoder, as shown in FIG. 4. Each node in the binary tree is represented by an LSTM unit, and a sentence vector is constructed from phrase vectors in a bottom-up manner. The k-th parent hidden unit h_k^(phr) is computed from the left and right child hidden units h_k^l and h_k^r:

h_k^(phr) = f_tree(h_k^l, h_k^r) (7)

where f_tree is a non-linear function. Each non-leaf node is likewise represented by an LSTM unit, and the LSTM unit of a parent node is computed from its two child LSTM units. The hidden unit h_k^(phr) and memory cell c_k^(phr) of the k-th parent node are computed as follows:

i_k = σ(W_i^l h_k^l + W_i^r h_k^r + b_i)
f_k^l = σ(W_fl^l h_k^l + W_fl^r h_k^r + b_fl)
f_k^r = σ(W_fr^l h_k^l + W_fr^r h_k^r + b_fr)
o_k = σ(W_o^l h_k^l + W_o^r h_k^r + b_o)
u~_k = tanh(W_u^l h_k^l + W_u^r h_k^r + b_u)
c_k^(phr) = i_k ⊙ u~_k + f_k^l ⊙ c_k^l + f_k^r ⊙ c_k^r
h_k^(phr) = o_k ⊙ tanh(c_k^(phr)) (8)

where i_k, f_k^l, f_k^r, o_k and u~_k are the input gate, the left and right forget gates, the output gate and the update state of the memory cell, respectively; c_k^l and c_k^r are the memory cells of the left and right child units; the W terms are weight matrices, the b terms are bias vectors, and ⊙ denotes the element-wise product;
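A minimal sketch of the parent update in equation (8), assuming PyTorch; the five gates are packed into one linear map over the concatenated child states, which is algebraically equivalent to the separate left/right weight matrices above:

    # Binary Tree-LSTM cell: one input gate, separate left/right forget
    # gates, an output gate and a candidate state, all computed from the
    # two child hidden states; memory cells combine as in eq. (8).
    import torch
    import torch.nn as nn

    class BinaryTreeLSTMCell(nn.Module):
        def __init__(self, dim=256):
            super().__init__()
            self.gates = nn.Linear(2 * dim, 5 * dim)   # i, f_l, f_r, o, u
            self.dim = dim

        def forward(self, h_l, c_l, h_r, c_r):
            a = self.gates(torch.cat([h_l, h_r], dim=-1))
            i, f_l, f_r, o, u = a.split(self.dim, dim=-1)
            c = (torch.sigmoid(i) * torch.tanh(u)
                 + torch.sigmoid(f_l) * c_l + torch.sigmoid(f_r) * c_r)
            h = torch.sigmoid(o) * torch.tanh(c)        # h_k^(phr)
            return h, c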
step3.2, when computing the LSTM units of the leaf nodes, we allow the model to compute different representations of the same word when it occurs multiple times in a sentence. The model now has two different sentence vectors: one from the sequence encoder and the other from the phrase-tree-based encoder. We provide another Tree-LSTM unit that takes the final sequence encoder unit h_n and the phrase-tree-based encoder unit h_root^(phr) as its two child units to initialize the decoder unit s_1:

s_1 = g_tree(h_n, h_root^(phr))

where the function g_tree has the same form as the function f_tree. This initialization allows the decoder to capture information from both the sequence data and the phrase structure. When the parser cannot output a parse tree for a sentence, setting h_root^(phr) = 0 makes the sentence be encoded by the sequence encoder alone, so any sentence can be processed by the phrase tree-sequence based encoder;
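A sketch of this initialization, reusing the BinaryTreeLSTMCell above as g_tree (our own naming); a zero phrase state makes the model fall back to the plain sequence encoder when no parse tree is available:

    # Decoder initialization: s_1 = g_tree(h_n, h_root^(phr)); when the
    # parser fails, the phrase state is zeroed and only the sequence
    # encoder's final state informs the decoder.
    import torch

    def init_decoder(g_tree, h_n, c_n, h_root, c_root, has_parse):
        if not has_parse:
            h_root, c_root = torch.zeros_like(h_n), torch.zeros_like(c_n)
        return g_tree(h_n, c_n, h_root, c_root)   # (s_1, initial memory cell)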
step3.3, an attention mechanism is introduced into the phrase tree-sequence model so that the model attends not only to the sequence hidden units but also to the phrase hidden units; when decoding a target word the model can learn which words or phrases in the source sentence are important, further merging phrase feature knowledge on top of the word-based encoder.
The j-th context vector d_j is computed by attention weights α_j(i) over both the sequence hidden units and the phrase hidden units:

d_j = Σ_{i=1}^{n} α_j(i) h_i + Σ_{k=1}^{n-1} α_j(n+k) h_k^(phr)

If the binary tree has n leaves, it has n-1 phrase nodes, so the attention spans 2n-1 source units in total, and the final decoder output is set to s~_j = tanh(W_d [s_j ; d_j] + b_d).
In addition, the invention adopts the input-feeding approach: the previous attentional hidden unit s~_{j-1} is fed into the computation of the current target hidden unit s_j together with the previously predicted word y_{j-1}:

s_j = g_dec([y_{j-1} ; s~_{j-1}], s_{j-1})

where [y_{j-1} ; s~_{j-1}] is the concatenation of the embedding of y_{j-1} and s~_{j-1}. The input-feeding approach helps enrich the computation of the decoder, and experiments show that it improves the BLEU score.
This completes the realization of Chinese-Vietnamese neural machine translation fusing language feature knowledge at different levels.
In deep semantic feature fusion, the hidden units and word embeddings are 256-dimensional, the forget gate bias terms of the LSTM and Tree-LSTM are initialized to 1.0, and the remaining model parameters are uniformly initialized in [-0.1, 0.1]. The model parameters are optimized with plain SGD, with an initial learning rate of 1.0 and a batch size of 128. When the loss worsens, the learning rate is halved. The gradient norm is clipped to 3.0 to avoid the exploding gradient problem. The models are evaluated with the automatic BLEU metric in the experiments.
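The training configuration described above can be sketched as follows, assuming PyTorch; the model and data are stand-ins:

    # SGD with initial learning rate 1.0, halved when validation loss
    # worsens; gradient norm clipped to 3.0; batches of 128.
    import torch
    import torch.nn as nn

    model = nn.Linear(8, 1)   # stand-in for the full translation model
    optimizer = torch.optim.SGD(model.parameters(), lr=1.0)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5)   # halve the learning rate

    for epoch in range(3):
        for _ in range(10):                  # stand-in training batches
            x, y = torch.randn(128, 8), torch.randn(128, 1)
            optimizer.zero_grad()
            loss = nn.functional.mse_loss(model(x), y)
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), 3.0)
            optimizer.step()
        val = nn.functional.mse_loss(model(torch.randn(32, 8)),
                                     torch.randn(32, 1))
        scheduler.step(val)                  # step on the validation loss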
To study the influence of multi-level language feature knowledge on Chinese-Vietnamese neural machine translation performance, the experiments compare a character-only model (LSTM+C), a word-only model (LSTM+W), a phrase-tree-only model (Tree-LSTM), a model fusing only characters and words (LSTM+C+W), a model fusing only characters and phrase trees (Tree-LSTM+C), a model fusing only words and phrase trees (Tree-LSTM+W), and the model proposed by the invention (Tree-LSTM+C+W). The results of the experiment are shown in Table 2.
TABLE 2 Influence of fusing different levels of linguistic feature knowledge on the BLEU score [the table is an image in the original publication and its contents are not reproduced here]
Comparing the experimental results in Table 2, the model fusing only characters and words (LSTM+C+W), the model fusing only characters and phrases (Tree-LSTM+C) and the model fusing only words and phrases (Tree-LSTM+W) all achieve higher BLEU scores than the three models without feature fusion. Compared with the model fusing only characters and words (LSTM+C+W), the model of the invention (Tree-LSTM+C+W) improves the BLEU score by 0.95 percentage points; compared with the model fusing only characters and phrases (Tree-LSTM+C), by 0.69 percentage points; and compared with the model fusing only words and phrases (Tree-LSTM+W), by 0.58 percentage points. By deeply mining and exploiting characters, words and phrases, the invention effectively introduces language feature knowledge at different levels into neural machine translation, improving the performance of Chinese-Vietnamese neural machine translation to a certain extent.
As can also be seen from Table 2, the character-only model (LSTM+C) performs poorly, with a BLEU score 0.68 percentage points lower than the word-only model (LSTM+W) and 1.24 percentage points lower than the phrase-tree-only model (Tree-LSTM). The likely reason is that although using characters alone reduces data sparsity to some extent, it greatly increases sentence length and thus the difficulty of learning long-distance dependencies. A model operating only on characters is therefore less competitive, which is also why, in the shallow semantic feature fusion stage, the invention does not replace word embeddings entirely with character embeddings but combines the two, allowing the model to fully exploit information at both granularity levels.
On this basis, to visually observe and compare the translation effects of the different models, a translation quality comparison is carried out on the outputs of four models fusing different levels of language feature knowledge. The results of the experiment are shown in Table 3.
TABLE 3 Comparison of translation quality [the table is an image in the original publication and its contents are not reproduced here]
As can be seen from Table 3, the translation quality of the model of the invention (Tree-LSTM+C+W) is higher: it translates the word "swimming" correctly, whereas the other models translate it inaccurately (the Vietnamese forms appear as images in the original publication and are not reproduced here). For the source "No matter how harsh the environment is", the reference translation is a Vietnamese sentence glossed roughly as "(no matter) (environment) (harsh) (how)", while the translation of the (LSTM+C+W) model is glossed roughly as "(no matter) (environment) (how) (harsh)", a clear syntactic error caused by the pure sequence model's lack of phrase syntax information. In the model of the invention, fusing character features yields high-quality word vector representations while the phrase tree preserves the syntactic information of the text; by fully exploiting multi-level language feature knowledge, the performance of Chinese-Vietnamese neural machine translation is improved.
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the invention is not limited to these embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the spirit of the invention.

Claims (4)

1. A Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge, characterized by comprising the following specific steps:
step1, corpus collection and preprocessing: collecting Chinese and Vietnamese parallel data, and preprocessing each side with preprocessing tools suited to the characteristics of Chinese and Vietnamese respectively;
step2, on the basis of Step1, obtaining vectors of the characters within a word using a bidirectional LSTM, and combining the word vector computed from character training with the pre-trained word vector to obtain a word vector fusing character features;
step3, deep semantic feature fusion: in head-driven phrase structure grammar, a sentence is composed of several phrase units and is represented as a binary tree; according to this sentence structure, a phrase-tree-based encoder is constructed on top of the standard sequence encoder, further merging phrase feature knowledge into the word-based standard sequence encoder;
this completes the realization of Chinese-Vietnamese neural machine translation fusing language feature knowledge at different levels.
2. The Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge of claim 1, wherein the specific steps of Step1 are as follows:
step1.1, obtaining 140K Chinese-Vietnamese parallel sentence pairs through web crawling and manual collection, of which 2K parallel sentence pairs are used as the test set and 2K parallel sentence pairs as the validation set;
step1.2, the Chinese data are segmented with the Jieba word segmenter, and phrase syntax parsing is performed with Stanford University's StanfordNLP toolkit; the Vietnamese data are parsed with a Vietnamese phrase syntax parsing tool to obtain Vietnamese phrase trees.
3. The Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge of claim 1, wherein the specific steps of Step2 are as follows:
step2.1, in neural machine translation, natural language must be characterized as feature vectors to serve as model input, and the semantic vector representation of a word is computed from the information of the characters within the word;
step2.2, combining the word vector computed from character training with the pre-trained word vector by a weighted combination to obtain the optimal representation of a semantic unit;
step2.3, common words already have high-quality word vector representations, so the character representation is aligned with the word vector by optimizing the vector, and the word vector fusing character features is finally obtained through training.
4. The Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge of claim 1, wherein the specific steps of Step3 are as follows:
step3.1, in head-driven phrase structure grammar, a sentence is composed of several phrases and is represented as a binary tree, where each node in the binary tree is represented by an LSTM unit and a sentence vector is constructed from phrase vectors in a bottom-up manner;
step3.2, when computing the LSTM units of the leaf nodes, the model is allowed to compute different representations for the same word when it occurs multiple times in a sentence; the model then has two different sentence vectors, one from the sequence encoder and the other from the phrase-tree-based encoder, so another Tree-LSTM unit is provided that takes the final sequence encoder unit and the phrase-tree-based encoder unit as its two child units and is used to initialize the decoder unit;
step3.3, an attention mechanism is introduced into the phrase tree-sequence model so that the model attends not only to the sequence hidden units but also to the phrase hidden units; when decoding a target word the model can learn which words or phrases in the source sentence are important, further merging phrase feature knowledge on top of the word-based encoder.
CN202011409192.3A 2020-12-03 2020-12-03 Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge Pending CN112541364A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011409192.3A 2020-12-03 2020-12-03 Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011409192.3A 2020-12-03 2020-12-03 Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge

Publications (1)

Publication Number Publication Date
CN112541364A true CN112541364A (en) 2021-03-23

Family

ID=75016025

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011409192.3A (Pending) 2020-12-03 2020-12-03 Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge

Country Status (1)

Country Link
CN (1) CN112541364A (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108363695A (en) * 2018-02-23 2018-08-03 西南交通大学 A kind of user comment attribute extraction method based on bidirectional dependency syntax tree characterization
CN108628823A (en) * 2018-03-14 2018-10-09 中山大学 In conjunction with the name entity recognition method of attention mechanism and multitask coordinated training
CN108920565A (en) * 2018-06-21 2018-11-30 苏州大学 A kind of picture header generation method, device and computer readable storage medium
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations
CN109858032A (en) * 2019-02-14 2019-06-07 程淑玉 Merge more granularity sentences interaction natural language inference model of Attention mechanism
CN109933795A (en) * 2019-03-19 2019-06-25 上海交通大学 Based on context-emotion term vector text emotion analysis system
CN110362723A (en) * 2019-05-31 2019-10-22 平安国际智慧城市科技股份有限公司 A kind of topic character representation method, apparatus and storage medium
CN110377918A (en) * 2019-07-15 2019-10-25 昆明理工大学 Merge the more neural machine translation method of the Chinese-of syntax analytic tree
CN110442880A (en) * 2019-08-06 2019-11-12 上海海事大学 A kind of interpretation method, device and the storage medium of machine translation translation
CN110728155A (en) * 2019-09-27 2020-01-24 内蒙古工业大学 Tree-to-sequence-based Mongolian Chinese machine translation method
CN110825845A (en) * 2019-10-23 2020-02-21 中南大学 Hierarchical text classification method based on character and self-attention mechanism and Chinese text classification method
CN111222318A (en) * 2019-11-19 2020-06-02 陈一飞 Trigger word recognition method based on two-channel bidirectional LSTM-CRF network
CN110879940A (en) * 2019-11-21 2020-03-13 哈尔滨理工大学 Machine translation method and system based on deep neural network
CN111353306A (en) * 2020-02-22 2020-06-30 杭州电子科技大学 Entity relationship and dependency Tree-LSTM-based combined event extraction method

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
AKIKO ERIGUCHI et al.: "Tree-to-Sequence Attentional Neural Machine Translation", arXiv:1603.06075v3 *
CHAO SU et al.: "Neural machine translation with Gumbel Tree-LSTM based encoder", Journal of Visual Communication and Image Representation *
MAHTAB AHMED et al.: "Improving Tree-LSTM with Tree Attention", arXiv:1901.00066v1 *
WANJIN CHE et al.: "Towards Integrated Classification Lexicon for Handling Unknown Words in Chinese-Vietnamese Neural Machine Translation", ACM *
CHANG BAOBAO: "Advances in graph-decoding dependency parsing based on deep learning", Journal of Shanxi University (Natural Science Edition) *
KANG SHIZE et al.: "Ontology alignment method based on word vectors and concept context information", Journal of Information Engineering University *
PU LIUQING et al.: "Chinese-Vietnamese neural machine translation method based on dependency graph networks", Journal of Chinese Information Processing *
WANG ZHENHAN et al.: "Chinese-Vietnamese convolutional neural machine translation fusing syntactic parse trees", Journal of Software *
ZHAO YAOU et al.: "Sentiment analysis fusing language-model-based word embeddings and multi-scale convolutional neural networks", Journal of Computer Applications *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113065358A (en) * 2021-04-07 2021-07-02 齐鲁工业大学 Text-to-semantic matching method based on multi-granularity alignment for bank consultation service
CN113065358B (en) * 2021-04-07 2022-05-24 齐鲁工业大学 Text-to-semantic matching method based on multi-granularity alignment for bank consultation service
CN113392656A (en) * 2021-06-18 2021-09-14 电子科技大学 Neural machine translation method fusing push-and-knock network and character coding
CN113609849A (en) * 2021-07-07 2021-11-05 内蒙古工业大学 Mongolian multi-mode fine-grained emotion analysis method fused with priori knowledge model
CN113901840A (en) * 2021-09-15 2022-01-07 昆明理工大学 Text generation evaluation method based on multi-granularity features
CN113901840B (en) * 2021-09-15 2024-04-19 昆明理工大学 Text generation evaluation method based on multi-granularity characteristics

Similar Documents

Publication Publication Date Title
CN110717334B (en) Text emotion analysis method based on BERT model and double-channel attention
CN108460013B (en) Sequence labeling model and method based on fine-grained word representation model
CN108829801B (en) Event trigger word extraction method based on document level attention mechanism
CN111241294B (en) Relationship extraction method of graph convolution network based on dependency analysis and keywords
CN112541364A (en) Chinese-Vietnamese neural machine translation method fusing multi-level language feature knowledge
CN109086267B (en) Chinese word segmentation method based on deep learning
CN107578106B (en) Neural network natural language reasoning method fusing word semantic knowledge
CN109359294B (en) Ancient Chinese translation method based on neural machine translation
CN109657239A (en) The Chinese name entity recognition method learnt based on attention mechanism and language model
CN108846017A (en) The end-to-end classification method of extensive newsletter archive based on Bi-GRU and word vector
CN110502753A (en) A kind of deep learning sentiment analysis model and its analysis method based on semantically enhancement
CN111160467A (en) Image description method based on conditional random field and internal semantic attention
CN111078866B (en) Chinese text abstract generation method based on sequence-to-sequence model
CN106502985A (en) A kind of neural network modeling approach and device for generating title
CN111125333B (en) Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism
CN115510814B (en) Chapter-level complex problem generation method based on dual planning
CN113254604B (en) Reference specification-based professional text generation method and device
CN111353040A (en) GRU-based attribute level emotion analysis method
CN110083824A (en) A kind of Laotian segmenting method based on Multi-Model Combination neural network
Zhang et al. A BERT fine-tuning model for targeted sentiment analysis of Chinese online course reviews
CN113822054A (en) Chinese grammar error correction method and device based on data enhancement
Ansari et al. Language Identification of Hindi-English tweets using code-mixed BERT
CN114564953A (en) Emotion target extraction model based on multiple word embedding fusion and attention mechanism
CN108763198B (en) Automatic generation method for related work in generative academic paper
CN112507717A (en) Medical field entity classification method fusing entity keyword features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210323