CN110472252B - Method for Chinese-Vietnamese neural machine translation based on transfer learning - Google Patents

Method for Chinese-Vietnamese neural machine translation based on transfer learning

Info

Publication number
CN110472252B
CN110472252B · CN201910751450.7A
Authority
CN
China
Prior art keywords
english
chinese
machine translation
neural machine
translation model
Prior art date
Legal status
Active
Application number
CN201910751450.7A
Other languages
Chinese (zh)
Other versions
CN110472252A (en)
Inventor
余正涛
黄继豪
郭军军
文永华
高盛祥
王振晗
Current Assignee
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Kunming University of Science and Technology
Priority to CN201910751450.7A
Publication of CN110472252A
Application granted
Publication of CN110472252B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a method for Chinese-Vietnamese neural machine translation based on transfer learning, belonging to the technical field of natural language processing. The invention comprises the following steps: corpus collection and preprocessing: collecting and preprocessing Chinese-Vietnamese, English-Vietnamese, and Chinese-English parallel sentence pairs; generating a Chinese-English-Vietnamese trilingual parallel corpus from the Chinese-English and English-Vietnamese parallel corpora; training a Chinese-English neural machine translation model and an English-Vietnamese neural machine translation model, and initializing the parameters of the Chinese-Vietnamese neural machine translation model with the parameters of the pre-trained models; and fine-tuning the initialized Chinese-Vietnamese neural machine translation model on the Chinese-Vietnamese parallel corpus to obtain the Chinese-Vietnamese neural machine translation model used for Chinese-Vietnamese translation. The invention can effectively improve the performance of Chinese-Vietnamese neural machine translation.

Description

Method for Chinese-Vietnamese neural machine translation based on transfer learning
Technical Field
The invention relates to a method for Chinese-Vietnamese neural machine translation based on transfer learning, belonging to the technical field of natural language processing.
Background
In recent years, exchanges between China and Vietnam have become increasingly frequent, and the demand for translation technology in low-resource scenarios such as Chinese-Vietnamese keeps growing. However, the performance of Chinese-Vietnamese neural machine translation is currently far from ideal, so improving the Chinese-Vietnamese neural machine translation system plays a very important role in communication between the two countries. End-to-end neural machine translation (NMT) is a new translation paradigm that uses a neural network to learn the mapping from source-language text to target-language text directly. Neural machine translation has achieved good translation quality on resource-rich language pairs and compelling results on many translation tasks. On the Chinese-Vietnamese task, however, it is still limited by the scale and quality of the parallel corpus: corpus resources are scarce and no large-scale Chinese-Vietnamese parallel corpus exists, so Chinese-Vietnamese neural machine translation performs poorly. Improving Chinese-Vietnamese machine translation therefore has a very important application prospect.
Pivot-language and transfer-learning methods are currently among the effective approaches to the poor performance of neural machine translation in low-resource scenarios. A pivot language bridges the source and target languages: the existing source-pivot and pivot-target parallel corpora are used to train source-to-pivot and pivot-to-target translation models, respectively. The advantage of this method is that translation between the source and target languages becomes possible even when no bilingual corpus is available for the language pair in the low-resource scenario. In addition, the neural machine translation task essentially requires the model to produce target-language sentences without losing the information in the source-language sentences, which makes it well suited to transfer learning. Compared with the pivot-language method, transfer learning can directly improve the parameters of the source-target model, so many researchers have carried out research in this area: the parameters of a model trained on a resource-rich language pair are used to initialize the parameters of the translation model in the low-resource scenario. However, these training processes lack the guidance of the small-scale bilingual parallel corpus, so the multilingual input is noisy. Moreover, the above methods focus on improving the parameters of the low-resource model as a whole and make no separate improvement to the encoder or the decoder. Chinese-Vietnamese neural machine translation is neural machine translation in a low-resource scenario: its training corpus is scarce, but large-scale Chinese-English and English-Vietnamese parallel corpora exist, so it is well suited to transfer learning and the pivot-language method. The invention therefore proposes a method for Chinese-Vietnamese neural machine translation based on transfer learning, which addresses the poor performance of Chinese-Vietnamese machine translation in the low-resource scenario.
Disclosure of Invention
The invention provides a method for Chinese-Vietnamese neural machine translation based on transfer learning, which is used to address the poor translation performance of Chinese-Vietnamese neural machine translation.
The technical scheme of the invention is as follows: the method for Chinese-Vietnamese neural machine translation based on transfer learning comprises the following specific steps:
Step 1, corpus collection and preprocessing: collect and preprocess Chinese-Vietnamese, English-Vietnamese, and Chinese-English parallel sentence pairs;
As a preferred embodiment of the present invention, Step 1 specifically comprises the following steps:
Step 1.1, crawl Chinese-Vietnamese, English-Vietnamese, and Chinese-English parallel sentence pairs with a web crawler, and set aside part of the data as a test set and a validation set;
Step 1.2, manually screen the crawled corpora, then perform word segmentation, replace Arabic numerals with the token "num", and filter out garbled text, so that the neural machine translation model achieves better results.
Step 2, generate a Chinese-English-Vietnamese trilingual parallel corpus from the Chinese-English and English-Vietnamese parallel corpora;
As a preferred embodiment of the present invention, Step 2 comprises the following specific steps:
Step 2.1, on the existing Chinese-English and English-Vietnamese data sets, apply back-translation through the pivot language, English: train an attention-based English-Chinese neural machine translation model on the English-Chinese parallel corpus, and then use the trained model to back-translate the English side of the English-Vietnamese parallel corpus into Chinese, thereby obtaining a Chinese-English-Vietnamese parallel corpus;
Step 2.2, apply a data-enhancement method to the Chinese-English-Vietnamese parallel corpus obtained in Step 2.1, replacing rare words in the Vietnamese corpus to expand the Chinese-English-Vietnamese parallel corpus.
Step 3, train a Chinese-English neural machine translation model and an English-Vietnamese neural machine translation model, and initialize the parameters of the Chinese-Vietnamese neural machine translation model with the parameters of the pre-trained models;
As a preferred embodiment of the present invention, Step 3 comprises the following specific steps:
In the basic neural machine translation model the source language is encoded into a fixed-length vector, but a fixed-length vector cannot adequately express the semantic information of the source-language sentence and its relation to the context; to solve this problem, an attention mechanism is introduced into the trained neural machine translation model;
Step 3.1, train a neural machine translation model with an attention mechanism on the Chinese-English and the English-Vietnamese parallel corpora respectively, obtaining an attention-based Chinese-English neural machine translation model and an attention-based English-Vietnamese neural machine translation model;
Step 3.2, initialize the encoder and decoder parameters of the Chinese-Vietnamese neural machine translation model with the Chinese encoder parameters of the Chinese-English model and the Vietnamese decoder parameters of the English-Vietnamese model.
Step 4, fine-tune the initialized Chinese-Vietnamese neural machine translation model on the Chinese-Vietnamese parallel corpus to obtain the Chinese-Vietnamese neural machine translation model used for Chinese-Vietnamese translation.
Because corpus resources are scarce and no large-scale Chinese-Vietnamese parallel corpus is available, the encoder of the Chinese-Vietnamese model yields poor semantic representations, which hurts Chinese-Vietnamese translation performance. Large-scale Chinese-English and English-Vietnamese parallel corpora do exist, however, and reusing the parameters of neural machine translation models trained on them is exactly the idea of transfer learning.
In Step 3:
The basic neural machine translation model represents a source-language sentence as a single fixed-length vector. Its disadvantage is that a fixed-length vector cannot fully express the semantic information of the source-language sentence and its relation to the context. An attention mechanism allows a neural network to focus on only part of its input, i.e. to select particular inputs. Attention-based neural machine translation first encodes the source-language sentence into a sequence of vectors; then, when generating each target-language word, it dynamically attends to the source-language words related to the word being generated, which greatly strengthens the expressive power of neural machine translation.
Neural machine translation is a data-driven language conversion process, and its performance depends on the scale and quality of the parallel corpus. The scale and quality of the Chinese-Vietnamese parallel corpus are limited, so the training data are insufficient and the encoder-decoder parameters cannot be optimized well. Transfer learning applies knowledge learned on one task to similar tasks: in a task under a low-resource scenario, the parameters learned from a high-resource task are used to improve the performance of the low-resource task, which reduces the amount of data the task requires. The invention therefore pre-trains attention-based Chinese-English and English-Vietnamese neural machine translation models on the large-scale Chinese-English and English-Vietnamese corpora, and initializes the encoder and decoder parameters of the attention-based Chinese-Vietnamese model with the Chinese encoder and the Vietnamese decoder.
The beneficial effects of the invention are:
1. First, a Chinese-English-Vietnamese parallel corpus is obtained by back-translation with the Chinese-English parallel corpus together with data enhancement, and it is added to the training corpus, which makes the parameters used to initialize the model in the next step more relevant;
2. The invention pre-trains the neural machine translation models on the Chinese-English-Vietnamese parallel corpus and initializes the encoder and decoder parameters of the Chinese-Vietnamese neural machine translation model with the parameters of the Chinese encoder and the Vietnamese decoder, so that the Chinese-Vietnamese model does not start training from randomly initialized parameters and can express semantic information more accurately. Finally, fine-tuning on the small-scale Chinese-Vietnamese corpus yields the Chinese-Vietnamese neural machine translation model; this further optimizes the initialized model and can effectively improve Chinese-Vietnamese neural machine translation performance;
3. The invention adopts the idea of transfer learning, so that the encoder of the Chinese-Vietnamese neural machine translation model represents the semantic information of the source language better and the decoding works better.
Drawings
FIG. 1 is a detailed flow chart of the present invention;
FIG. 2 is a flow chart of the training process of the proposed transfer-learning-based Chinese-Vietnamese neural machine translation.
Detailed Description
Example 1: as shown in FIGS. 1-2, a method for Chinese-Vietnamese neural machine translation based on transfer learning comprises the following steps:
Step 1, crawl the training corpora with a web crawler: 100,000 Chinese-Vietnamese sentence pairs, 700,000 English-Vietnamese sentence pairs, and 50,000,000 Chinese-English sentence pairs; manually screen the crawled corpora and filter out garbled text; set aside part of the data as a test set and a validation set;
The crawled corpora are manually screened and then word-segmented; Arabic numerals are replaced with the token "num" and garbled text is filtered out, as in the sketch below.
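For illustration only, the following Python sketch shows one way this preprocessing could be implemented; the file layout, the garbled-text heuristic, and the function names are assumptions of this sketch rather than details specified by the patent (word segmentation itself would be done beforehand with an external segmenter).

```python
import re

def is_garbled(line: str) -> bool:
    # Drop lines containing the Unicode replacement character or stray
    # control characters; the patent does not specify its garbled-text
    # filter, so this heuristic is an assumption.
    return any(ch == "\ufffd" or (ord(ch) < 32 and ch not in "\t\n")
               for ch in line)

def preprocess(line: str) -> str:
    # Replace every run of Arabic numerals (optionally with a decimal
    # part) by the single token "num", as Step 1.2 prescribes.
    line = re.sub(r"\d+(?:\.\d+)?", "num", line)
    return " ".join(line.split())

def clean_corpus(src_path: str, dst_path: str) -> None:
    with open(src_path, encoding="utf-8") as fin, \
         open(dst_path, "w", encoding="utf-8") as fout:
        for raw in fin:
            if is_garbled(raw):
                continue          # garbled-text filtering
            fout.write(preprocess(raw) + "\n")
```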
Step2, in the existing data set of Chinese-English and English-Vietnamese, a method for retranslating axial language and English is used, firstly, a 4-layer attention-based neural machine translation system with a word list of 32000 trains an attention-based English-Chinese neural machine translation model by adopting large-scale English-Chinese parallel linguistic data, and secondly, the trained attention-based English-Chinese neural machine translation model retranslates English in English-Vietnamese parallel linguistic data into Chinese, so that Chinese-English-Vietnamese parallel linguistic data are obtained;
The Chinese-English-Vietnamese parallel corpus obtained in Step 2.1 is then expanded with a data-enhancement method that replaces rare words in the Vietnamese corpus: the rare-word frequency threshold in the Vietnamese corpus is set to 20, only one rare word is replaced at a time, and replacing the rare words in sentence pairs expands the Chinese-English-Vietnamese parallel corpus;
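The following sketch illustrates one possible form of this rare-word augmentation under the stated threshold of 20; the patent does not specify how the substitute word is chosen, so drawing it from the same rare vocabulary is an assumption of this sketch.

```python
from collections import Counter
import random

def rare_vocab(vi_sentences, threshold: int = 20):
    # Step 2.2: words whose frequency in the Vietnamese corpus does not
    # exceed the stated threshold of 20 are treated as rare.
    counts = Counter(w for s in vi_sentences for w in s.split())
    return {w for w, c in counts.items() if c <= threshold}

def augment(triples, rare):
    # Replace exactly one rare Vietnamese word per sentence pair (the
    # patent replaces only one rare word at a time); choosing the
    # substitute from the rare vocabulary is an assumption here.
    new_triples = []
    for zh, en, vi in triples:
        words = vi.split()
        positions = [i for i, w in enumerate(words) if w in rare]
        if not positions:
            continue
        i = random.choice(positions)
        candidates = sorted(rare - {words[i]})
        if not candidates:
            continue
        words[i] = random.choice(candidates)
        new_triples.append((zh, en, " ".join(words)))
    return new_triples
```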
Step 3, train the Chinese-English and English-Vietnamese neural machine translation models, and initialize the parameters of the Chinese-Vietnamese neural machine translation model with the parameters of the pre-trained models;
To solve the problem that the basic neural machine translation model encodes the source language into a fixed-length vector that cannot adequately express the semantic information of the source-language sentence and its relation to the context, an attention mechanism is introduced into the trained neural machine translation model;
As a preferred embodiment of the present invention, Step 3 comprises the following specific steps:
Step 3.1, train a neural machine translation model with an attention mechanism on the Chinese-English and English-Vietnamese parallel corpora respectively, obtaining an attention-based Chinese-English neural machine translation model and an attention-based English-Vietnamese neural machine translation model;
As shown in FIG. 2, two models (Pre-train Model A and Pre-train Model B) are first obtained by training on the Chinese-English and the English-Vietnamese parallel corpora. In training both the attention-based Chinese-English model and the attention-based English-Vietnamese model, the sequence of source-language words is denoted w^x = (w^x_1, ..., w^x_n) and the sequence of target-language words is denoted w^z = (w^z_1, ..., w^z_m). Let GloVe(w^x) be the sequence of GloVe vectors corresponding to the words in w^x, and let z be the sequence of randomly initialized word vectors corresponding to the words in w^z. GloVe(w^x) is fed to a two-layer, bidirectional LSTM (Long Short-Term Memory network), called the NMT-LSTM, which computes the sequence of hidden states:

h = NMT-LSTM(GloVe(w^x))    (1)
In this machine translation model, the NMT-LSTM feeds an attention-driven decoder that at each step computes, from a context-adjusted hidden state h̃_t, the conditional probability of the next target word. At step t, given the previous target-word embedding z_{t-1}, the decoder, a unidirectional two-layer LSTM, combines it with the previous context-adjusted hidden state h̃_{t-1} to obtain the hidden state h^dec_t, as follows:

h^dec_t = LSTM([z_{t-1}; h̃_{t-1}], h^dec_{t-1})    (2)

The decoder then computes a vector of attention weights α_t measuring the relevance of each encoding step to the current decoder state:

α_t = softmax(H(W_1 h^dec_t + b_1))    (3)

where H is the stack of the encoder hidden states h along the time dimension. The context-adjusted hidden state h̃_t is obtained by a weighted sum of the encoder states under the attention weights, followed by a tanh nonlinearity; the specific formula is:

h̃_t = tanh(W_2 [Hᵀ α_t; h^dec_t] + b_2)    (4)

The probability distribution over output words is generated by a final transformation of the context-adjusted hidden state:

p(w^z_t | X, w^z_1, ..., w^z_{t-1}) = softmax(W_out h̃_t + b_out)    (5)
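As a concrete illustration of equations (1)-(5), a minimal PyTorch sketch follows; the class name, layer dimensions, and the use of nn.LSTM modules are assumptions of this sketch rather than the patent's exact implementation. The logits it returns are the argument of the softmax in equation (5).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnNMT(nn.Module):
    """Attention-based NMT model following equations (1)-(5)."""

    def __init__(self, emb_dim: int, hid_dim: int, vocab_size: int):
        super().__init__()
        # Eq. (1): two-layer bidirectional NMT-LSTM encoder
        # (hid_dim is assumed even, so the two directions sum to hid_dim).
        self.encoder = nn.LSTM(emb_dim, hid_dim // 2, num_layers=2,
                               bidirectional=True, batch_first=True)
        # Eq. (2): unidirectional two-layer decoder LSTM whose input is
        # the concatenation [z_{t-1}; h~_{t-1}].
        self.decoder = nn.LSTM(emb_dim + hid_dim, hid_dim, num_layers=2,
                               batch_first=True)
        self.W1 = nn.Linear(hid_dim, hid_dim)        # eq. (3): W_1, b_1
        self.W2 = nn.Linear(2 * hid_dim, hid_dim)    # eq. (4): W_2, b_2
        self.out = nn.Linear(hid_dim, vocab_size)    # eq. (5): W_out, b_out

    def forward(self, src_emb: torch.Tensor, tgt_emb: torch.Tensor):
        # src_emb: (B, S, emb_dim) GloVe embeddings of w^x.
        # tgt_emb: (B, T, emb_dim) target embeddings, shifted right so
        # that tgt_emb[:, t] plays the role of z_{t-1} (teacher forcing).
        H, _ = self.encoder(src_emb)               # (1): h = NMT-LSTM(...)
        B = tgt_emb.size(0)
        h_tilde = src_emb.new_zeros(B, H.size(-1))  # h~_0
        state, logits = None, []
        for t in range(tgt_emb.size(1)):
            dec_in = torch.cat([tgt_emb[:, t], h_tilde], dim=-1).unsqueeze(1)
            dec_out, state = self.decoder(dec_in, state)          # (2)
            h_dec = dec_out.squeeze(1)
            # (3): alpha_t = softmax(H (W_1 h^dec_t + b_1))
            scores = torch.bmm(H, self.W1(h_dec).unsqueeze(-1)).squeeze(-1)
            alpha = F.softmax(scores, dim=-1)
            # (4): h~_t = tanh(W_2 [H^T alpha_t; h^dec_t] + b_2)
            ctx = torch.bmm(alpha.unsqueeze(1), H).squeeze(1)
            h_tilde = torch.tanh(self.W2(torch.cat([ctx, h_dec], dim=-1)))
            logits.append(self.out(h_tilde))        # (5), before softmax
        return torch.stack(logits, dim=1)           # (B, T, vocab_size)
```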
Step 3.2, when training the Chinese-to-Vietnamese neural machine translation model, the Chinese encoder parameters of the Chinese-English neural machine translation model are used to initialize the encoder parameters of the Chinese-Vietnamese model, and the Vietnamese decoder parameters of the English-Vietnamese neural machine translation model are used to initialize the decoder parameters of the Chinese-Vietnamese model.
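The following sketch illustrates this initialization under the assumption that all three models share the layout of the AttnNMT sketch above and that the Chinese-Vietnamese and English-Vietnamese models share the Vietnamese target vocabulary; the parameter-name prefixes are taken from that sketch, not from the patent.

```python
def transfer_init(zh_vi_model: AttnNMT, zh_en_model: AttnNMT,
                  en_vi_model: AttnNMT) -> None:
    """Step 3.2: initialize the zh-vi model from the two pre-trained models."""
    state = zh_vi_model.state_dict()
    for name, tensor in zh_en_model.state_dict().items():
        if name.startswith("encoder."):        # Chinese encoder parameters
            state[name] = tensor.clone()
    for name, tensor in en_vi_model.state_dict().items():
        if not name.startswith("encoder."):    # Vietnamese decoder,
            state[name] = tensor.clone()       # attention, and output layers
    zh_vi_model.load_state_dict(state)
```

Copying every non-encoder parameter transfers the attention and output layers together with the decoder LSTM, i.e. the whole target side is initialized from the English-Vietnamese model.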
Step 4, fine-tune the initialized Chinese-Vietnamese neural machine translation model on the Chinese-Vietnamese parallel corpus to obtain the Chinese-Vietnamese neural machine translation model used for Chinese-Vietnamese translation.
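A minimal fine-tuning sketch for this step is given below; it assumes the AttnNMT interface from the sketch above, and the optimizer, learning rate, and batch format are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fine_tune(model, zh_vi_batches, epochs: int = 10, lr: float = 1e-4):
    """Step 4: continue training the transfer-initialized model on the
    small Chinese-Vietnamese parallel corpus (teacher forcing)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for src_emb, tgt_emb, tgt_ids in zh_vi_batches:
            logits = model(src_emb, tgt_emb)
            loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                                   tgt_ids.reshape(-1))
            opt.zero_grad()
            loss.backward()
            opt.step()
```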
The model with the initialized parameters is fine-tuned (Fine-tune Model C) on the Chinese-Vietnamese parallel corpus to obtain the Chinese-Vietnamese neural machine translation model. Table 1 compares the BLEU scores of the baseline systems and the transfer-learning-based Chinese-Vietnamese neural machine translation model (TLNMT) in both the Chinese-to-Vietnamese and the Vietnamese-to-Chinese translation directions, and Table 2 gives example translations comparing the baseline systems and TLNMT in the Chinese-to-Vietnamese direction.
Table 1. BLEU comparison of the different models
[Table 1 is provided as an image in the original publication; the gains it reports are quoted in the text below.]
Table 2. Example translations of the different models
[Table 2 is provided as an image in the original publication.]
The experimental results show that the TLNMT method for Chinese-Vietnamese bilingual neural machine translation clearly outperforms the other methods. Compared with the NMT baseline, TLNMT improves BLEU by 4.48 in the Chinese-to-Vietnamese direction and by 1.66 in the Vietnamese-to-Chinese direction. Compared with the OpenNMT model, TLNMT gains 1.16 BLEU in the Chinese-to-Vietnamese direction and 1.05 BLEU in the Vietnamese-to-Chinese direction.
From the first sentence group in Table 2, the OpenNMT translation is inaccurate: the words rendered as "Hubble" and "trace", whose Vietnamese forms appear as images in the original publication, are left untranslated. In the training set and test set, numbers are replaced by "num". In the second sentence group, the OpenNMT translation omits even more words than in the first group, such as the words rendered as "edge", "diffraction", and "soft" (again shown as Vietnamese text images in the original); and the "num" token of the source sentence does not appear in the OpenNMT translation but does appear in the TLNMT translation. The reason for these problems is that the missing words occur infrequently in the training corpus, so the neural machine translation model cannot learn good semantic representations of these low-frequency words and drops them. The invention adopts the ideas of transfer learning and pivot language, so that the encoder of the Chinese-Vietnamese neural machine translation model expresses the semantic information of the source language better and the decoding works better, which is why TLNMT produces better translations.
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to those embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the spirit of the present invention.

Claims (3)

1. A method for Chinese-Vietnamese neural machine translation based on transfer learning, characterized by comprising the following specific steps:
Step 1, corpus collection and preprocessing: collecting and preprocessing Chinese-Vietnamese, English-Vietnamese, and Chinese-English parallel sentence pairs;
Step 2, generating a Chinese-English-Vietnamese trilingual parallel corpus from the Chinese-English and English-Vietnamese parallel corpora;
Step 3, training a Chinese-English neural machine translation model and an English-Vietnamese neural machine translation model, and initializing the parameters of the Chinese-Vietnamese neural machine translation model with the parameters of the pre-trained models;
Step 3 is specifically as follows:
introducing an attention mechanism into the trained neural machine translation model; training a neural machine translation model with the attention mechanism on the Chinese-English and English-Vietnamese parallel corpora respectively to obtain an attention-based Chinese-English neural machine translation model and an attention-based English-Vietnamese neural machine translation model; and then initializing the encoder and decoder parameters of the Chinese-Vietnamese neural machine translation model with the Chinese encoder parameters of the Chinese-English model and the Vietnamese decoder parameters of the English-Vietnamese model;
Step 4, fine-tuning the initialized Chinese-Vietnamese neural machine translation model on the Chinese-Vietnamese parallel corpus to obtain the Chinese-Vietnamese neural machine translation model used for Chinese-Vietnamese translation.
2. The method for Chinese-Vietnamese neural machine translation based on transfer learning according to claim 1, wherein Step 1 specifically comprises:
Step 1.1, crawling Chinese-Vietnamese, English-Vietnamese, and Chinese-English parallel sentence pairs with a web crawler, and setting aside part of the data as a test set and a validation set;
Step 1.2, manually screening the crawled corpora, then performing word segmentation, replacing Arabic numerals with the token "num", and filtering out garbled text.
3. The method for Chinese-Vietnamese neural machine translation based on transfer learning according to claim 1, wherein Step 2 specifically comprises:
Step 2.1, on the existing Chinese-English and English-Vietnamese data sets, applying back-translation through the pivot language, English: training an attention-based English-Chinese neural machine translation model on the English-Chinese parallel corpus, and then using the trained model to back-translate the English side of the English-Vietnamese parallel corpus into Chinese, thereby obtaining a Chinese-English-Vietnamese parallel corpus;
Step 2.2, applying a data-enhancement method to the Chinese-English-Vietnamese parallel corpus obtained in Step 2.1, replacing rare words in the Vietnamese corpus to expand the Chinese-English-Vietnamese parallel corpus.
CN201910751450.7A 2019-08-15 2019-08-15 Method for Chinese-Vietnamese neural machine translation based on transfer learning Active CN110472252B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910751450.7A CN110472252B (en) 2019-08-15 2019-08-15 Method for Chinese-Vietnamese neural machine translation based on transfer learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910751450.7A CN110472252B (en) 2019-08-15 2019-08-15 Method for Chinese-Vietnamese neural machine translation based on transfer learning

Publications (2)

Publication Number Publication Date
CN110472252A CN110472252A (en) 2019-11-19
CN110472252B (en) 2022-12-13

Family

ID=68511726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910751450.7A Active CN110472252B (en) 2019-08-15 2019-08-15 Method for translating Hanyue neural machine based on transfer learning

Country Status (1)

Country Link
CN (1) CN110472252B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111104807B (en) * 2019-12-06 2024-05-24 北京搜狗科技发展有限公司 Data processing method and device and electronic equipment
CN111178094B (en) * 2019-12-20 2023-04-07 沈阳雅译网络技术有限公司 Pre-training-based scarce resource neural machine translation training method
CN111680520A (en) * 2020-04-30 2020-09-18 昆明理工大学 Synonym data enhancement-based Chinese-Vietnamese neural machine translation method
CN112287694A (en) * 2020-09-18 2021-01-29 昆明理工大学 Shared encoder-based Chinese-Vietnamese unsupervised neural machine translation method
CN112257460B (en) * 2020-09-25 2022-06-21 昆明理工大学 Pivot-based Chinese-Vietnamese joint training neural machine translation method
CN112215017B (en) * 2020-10-22 2022-04-29 内蒙古工业大学 Mongolian Chinese machine translation method based on pseudo parallel corpus construction
CN112633018B (en) * 2020-12-28 2022-04-15 内蒙古工业大学 Mongolian Chinese neural machine translation method based on data enhancement
CN113239708B (en) * 2021-04-28 2023-06-20 华为技术有限公司 Model training method, translation method and device
CN113657122B (en) * 2021-09-07 2023-12-15 内蒙古工业大学 Mongolian machine translation method of pseudo parallel corpus integrating transfer learning


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8645289B2 (en) * 2010-12-16 2014-02-04 Microsoft Corporation Structured cross-lingual relevance feedback for enhancing search results

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5787386A (en) * 1992-02-11 1998-07-28 Xerox Corporation Compact encoding of multi-lingual translation dictionaries
CN102111160A (en) * 2010-11-23 2011-06-29 中国科学技术大学 Coding and decoding system and codec for reactive system test
US10268685B2 (en) * 2015-08-25 2019-04-23 Alibaba Group Holding Limited Statistics-based machine translation method, apparatus and electronic device
CN107092594A (en) * 2017-04-19 2017-08-25 厦门大学 Graph-based bilingual recursive autoencoder
CN108363704A (en) * 2018-03-02 2018-08-03 北京理工大学 Neural machine translation corpus expansion method based on a statistical phrase table
CN108536687A (en) * 2018-04-20 2018-09-14 王立山 Language translation method and system for a mind machine based on a predicate-calculus-like form
CN108829684A (en) * 2018-05-07 2018-11-16 内蒙古工业大学 Mongolian-Chinese neural machine translation method based on a transfer learning strategy
CN109213851A (en) * 2018-07-04 2019-01-15 中国科学院自动化研究所 Cross-lingual transfer method for spoken language understanding in dialogue systems
CN109117483A (en) * 2018-07-27 2019-01-01 清华大学 Training method and device for a neural network machine translation model
CN109446535A (en) * 2018-10-22 2019-03-08 内蒙古工业大学 Mongolian-Chinese neural machine translation method based on a triangle framework

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A Correlational Encoder Decoder Architecture for Pivot Based Sequence Generation; Amrita Saha et al.; arXiv:1606.04754; 2016-06-15 *
Multilingual Neural Machine Translation for Low-Resource Languages; Surafel M. Lakew et al.; Italian Journal of Computational Linguistics; 2018-04-01; vol. 4, no. 1; pp. 11-25 *
Research on multilingual neural machine translation based on a pivot language; Liu Qingmin et al.; Science and Technology Innovation; 2019-02-15; pp. 86-87 *
Research on a Mongolian-Chinese query term expansion method based on cross-lingual word vector models; Ma Lujia et al.; Journal of Chinese Information Processing; 2019-06-15; vol. 33, no. 6; pp. 27-34 *
A survey of neural machine translation; Li Yachao et al.; Chinese Journal of Computers; 2018-12-15; vol. 41, no. 12; pp. 2735-2755 *

Also Published As

Publication number Publication date
CN110472252A (en) 2019-11-19

Similar Documents

Publication Publication Date Title
CN110472252B (en) Method for Chinese-Vietnamese neural machine translation based on transfer learning
CN110334361B (en) Neural machine translation method for Chinese language
CN107357789B (en) Neural machine translation method fusing multi-language coding information
CN111382580B (en) Encoder-decoder framework pre-training method for neural machine translation
CN109359294B (en) Ancient Chinese translation method based on neural machine translation
CN111178094B (en) Pre-training-based scarce resource neural machine translation training method
CN109271643A (en) A kind of training method of translation model, interpretation method and device
CN111916067A (en) Training method and device of voice recognition model, electronic equipment and storage medium
CN112287688B (en) English-Burmese bilingual parallel sentence pair extraction method and device integrating pre-training language model and structural features
CN108829684A (en) Mongolian-Chinese neural machine translation method based on a transfer learning strategy
CN111783462A (en) Chinese named entity recognition model and method based on dual neural network fusion
CN111241816B (en) Automatic news headline generation method
CN113283244B (en) Pre-training model-based bidding data named entity identification method
CN110688862A (en) Mongolian-Chinese inter-translation method based on transfer learning
CN110163181B (en) Sign language identification method and device
CN114757182A (en) BERT short text sentiment analysis method for improving training mode
CN111581970B (en) Text recognition method, device and storage medium for network context
CN104462072A (en) Input method and device oriented at computer-assisting translation
CN112464676A (en) Machine translation result scoring method and device
CN110569505A (en) text input method and device
CN110427629A (en) Semi-supervised text simplified model training method and system
CN113190656A (en) Chinese named entity extraction method based on multi-label framework and fusion features
CN109145946B (en) Intelligent image recognition and description method
CN111666756A (en) Sequence model text abstract generation method based on topic fusion
CN113609284A (en) Method and device for automatically generating text abstract fused with multivariate semantics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant