CN108549644A - Omission pronominal translation method towards neural machine translation - Google Patents
- Publication number
- CN108549644A CN108549644A CN201810326895.6A CN201810326895A CN108549644A CN 108549644 A CN108549644 A CN 108549644A CN 201810326895 A CN201810326895 A CN 201810326895A CN 108549644 A CN108549644 A CN 108549644A
- Authority
- CN
- China
- Prior art keywords
- pronoun
- language material
- missing
- word alignment
- machine translation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Machine Translation (AREA)
Abstract
The present invention relates to a corpus processing method for handling omitted pronouns in a neural machine translation (NMT) system, applied to attention-based NMT models that use an encoder-decoder framework, comprising: obtaining the original corpus; performing word alignment on the obtained corpus to find the approximate positions of missing pronouns; inserting every candidate pronoun into every candidate missing position; selecting the most suitable pronoun and position with a language model; performing word alignment again and replacing each supplemented pronoun with the corresponding pronoun from the target sentence; and training a sequence-labeling model on the supplemented training corpus. This corpus processing method automatically restores the pronouns omitted in source sentences while avoiding the ambiguity introduced when missing pronouns are filled in with source-language words, thereby effectively improving translation quality. The invention further relates to a translation method using a neural machine translation system.
Description
Technical field
The present invention relates to neural machine translation, and more particularly to an omitted-pronoun translation method for neural machine translation.
Background technology
With the improvement of computing power and the application of big data, deep learning has found ever wider use, and neural machine translation (NMT) based on deep learning has attracted more and more attention. The most common translation model in the NMT field is the attention-based encoder-decoder model. Its main idea is that the sentence to be translated (hereinafter the 'source sentence') is encoded by an encoder into a vector representation, and a decoder then decodes that representation to produce the corresponding translation (hereinafter the 'target sentence'). Although deep-learning-based NMT can translate source sentences well to a certain extent, it performs much less well when translating from a language that habitually omits pronouns into one that does not. For example, the colloquial Chinese sentence '吃了吗' corresponds to the English 'Have you eaten?', but in practice a standard attention-based encoder-decoder model outputs something like 'Eaten?'. Because '你' (you) is omitted in Chinese while English does not omit pronouns, machine translation of such pronoun-dropping spoken sentences greatly reduces the fluency and readability of the output, and thus degrades translation quality.
Two kinds of methods currently exist to solve this problem:
1. Manually filling in the pronouns omitted in the source sentences.
2. Automatically supplementing the omitted pronouns with source-language words, as follows: first, a word-alignment operation (matching corresponding words between the two sentences) is performed between the source and target sentences to obtain the approximate positions of the missing pronouns; then every candidate pronoun is inserted into every candidate missing position; finally, a language model (which judges whether a sentence reads like a normal statement: the lower the perplexity, the closer to natural language) selects the most suitable pronoun and position.
The corpus processed by either of the two methods above is then translated with the attention-based encoder-decoder model.
The shortcomings of the two existing techniques:
The first (manual supplementation) is time-consuming and laborious, and the amount of corpus it can handle is limited.
The second (automatic supplementation) avoids the drawbacks of manual supplementation, but the pronouns it fills in easily cause ambiguity during translation: a supplemented Chinese pronoun '我' may become either 'I' or 'me' in English, and the ambiguity produced by such one-to-many words degrades translation quality.
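The automatic supplementation described above (find alignment gaps, try candidate pronouns, pick by language-model score) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the pronoun list is a small subset, and `lm_score` is a hypothetical stand-in for a real language model (lower score = more fluent).

```python
# Sketch of the prior-art automatic supplementation step: every candidate
# pronoun is tried at every candidate gap position, and the insertion with
# the lowest language-model score is kept.

from itertools import product

CANDIDATE_PRONOUNS = ["我", "你", "他", "她", "我们", "你们", "他们"]

def lm_score(tokens):
    """Hypothetical stand-in for a language model: returns a low score if a
    pronoun appears anywhere before the final token, a high score otherwise.
    A real system would compute perplexity with an n-gram or neural LM."""
    for tok in tokens[:-1]:
        if tok in CANDIDATE_PRONOUNS:
            return 1.0
    return 10.0

def best_insertion(tokens, gap_positions, pronouns=CANDIDATE_PRONOUNS):
    """Try every (position, pronoun) pair; return the lowest-scoring sentence."""
    best_score, best_cand = float("inf"), None
    for pos, pro in product(gap_positions, pronouns):
        cand = tokens[:pos] + [pro] + tokens[pos:]
        score = lm_score(cand)
        if score < best_score:
            best_score, best_cand = score, cand
    return best_cand

print(best_insertion(["吃", "了", "吗"], gap_positions=[0, 1]))
```

A real implementation would score with an actual language model and consider all 31 pronouns mentioned later in the patent.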
Invention content
In view of the shortcomings of the above techniques, we propose a method that automatically supplements the pronouns omitted in the source sentence while avoiding the ambiguity produced when missing pronouns are filled in with source-language words, thereby effectively improving translation quality.
A corpus processing method for handling pronouns omitted in a neural machine translation system, applied to attention-based NMT models using an encoder-decoder framework, comprising:
obtaining the original corpus;
performing word alignment on the obtained corpus to find the approximate positions of missing pronouns;
inserting every candidate pronoun into every candidate missing position;
selecting the most suitable pronoun and position with a language model;
performing word alignment again and replacing each supplemented pronoun with the corresponding pronoun from the target sentence;
training a sequence-labeling model on the supplemented training corpus;
labeling the development and test sets with the trained sequence-labeling model to supplement their pronouns.
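The step that distinguishes this method from the prior art, replacing the supplemented source-language pronoun with the pronoun from the aligned target sentence, can be sketched as follows. The alignment dict stands in for GIZA++ output, and the tokens and indices are illustrative assumptions.

```python
# Sketch of the target-pronoun copy step: after a pronoun has been filled
# in on the source side, a second word alignment maps the filled-in
# position to its target-side pronoun, which then replaces the
# source-language one.

def replace_with_target_pronoun(src_tokens, trg_tokens, alignment, filled_idx):
    """alignment: dict mapping a source index to a target index (one-to-one
    for simplicity; real GIZA++ output can be many-to-many)."""
    trg_idx = alignment.get(filled_idx)
    if trg_idx is None:
        return src_tokens  # no aligned target word; keep the supplemented form
    out = list(src_tokens)
    out[filled_idx] = trg_tokens[trg_idx]  # copy the target-end pronoun
    return out

src = ["我", "用", "了", "我", "一辈子"]    # the second '我' was filled in at index 3
trg = ["I", "spend", "my", "whole", "life"]
alignment = {0: 0, 3: 2, 4: 4}             # filled index 3 aligns to 'my'
print(replace_with_target_pronoun(src, trg, alignment, filled_idx=3))
```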
The above corpus processing method automatically restores the pronouns omitted in source sentences and avoids the ambiguity introduced when missing pronouns are filled in with source-language words, thereby effectively improving translation quality.
In another embodiment, in the step "performing word alignment on the obtained corpus to find the approximate positions of missing pronouns", word alignment is carried out with the GIZA++ model.
In another embodiment, in the step "performing word alignment again and replacing each supplemented pronoun with the corresponding pronoun from the target sentence", word alignment is carried out with the GIZA++ model.
In another embodiment, the word-alignment methods used in the two word-alignment steps above are the same.
In another embodiment, the word-alignment methods used in the two word-alignment steps above are different.
A translation method using a neural machine translation system, applied to attention-based NMT models using an encoder-decoder framework:
processing the original corpus with the above corpus processing method for handling omitted pronouns;
inserting a first label and a second label before and after each supplemented pronoun in the source sentence;
adding the same first and second labels around the corresponding pronoun in the target sentence;
training the NMT system on the corpus processed as above;
translating with the trained NMT system.
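The labeling steps above can be sketched as follows, assuming the <copy>/</copy> label pair described in the embodiments below; the example tokens are illustrative.

```python
# Minimal sketch of the tagging step: a first label is inserted before and
# a second label after the supplemented pronoun on the source side, and
# around the corresponding pronoun on the target side, so the NMT model can
# learn the src/trg <copy>...</copy> correspondence.

def wrap_copy(tokens, pronoun_idx, open_tag="<copy>", close_tag="</copy>"):
    """Insert open_tag before and close_tag after the token at pronoun_idx."""
    return (tokens[:pronoun_idx]
            + [open_tag, tokens[pronoun_idx], close_tag]
            + tokens[pronoun_idx + 1:])

src = ["我", "用", "了", "my", "一辈子"]   # 'my' was copied in at index 3
trg = ["I", "spend", "my", "whole", "life"]
print(wrap_copy(src, 3))
print(wrap_copy(trg, 2))
```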
In one embodiment, the first label is <copy> and the second label is </copy>.
In another embodiment, the first label is </copy> and the second label is <copy>.
A computer device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, implements the steps of any of the above methods.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the above methods.
Description of the drawings
Fig. 1 is a schematic diagram of a corpus processing method for handling omitted pronouns in a neural machine translation system according to an embodiment of the present application.
Fig. 2 is a flowchart of the corpus processing method.
Fig. 3 is a flowchart of a translation method using a neural machine translation system according to an embodiment of the present application.
Fig. 4 is the first effect diagram of the corpus processing method.
Fig. 5 is the second effect diagram of the corpus processing method.
Specific implementation mode
To make the purpose, technical scheme, and advantages of the present invention clearer, the invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it.
The foundation of this application, the attention-based NMT model, is introduced first. In neural machine translation systems, translation is generally realized with an encoder-decoder framework. For each word in the training corpus, we initialize a word vector, and the word vectors of all words constitute the word-vector dictionary. A word vector is usually a multidimensional vector in which every dimension is a real number; the dimensionality is generally determined from experimental results. For example, the word vector of the word "we" might be <0.12, -0.23, ..., 0.99>.
The encoder consists of a bidirectional RNN (Recurrent Neural Network). In the encoder stage, the encoder reads in a sentence and encodes it into a series of vectors. Concretely, the sentence is first expressed as a sequence of word vectors x = <x_1, x_2, ..., x_T>, where x is the input sentence and x_j is the m-dimensional word vector of the j-th word. A forward RNN reads the sentence left to right and, via the recurrence h_j = f(h_{j-1}, x_j), produces a forward sequence of hidden-layer vectors; by the same principle a backward RNN reads right to left and produces a backward sequence. For each word x_j we concatenate its forward and backward hidden vectors as its encoded, context-aware representation h_j. From the hidden-layer sequence <h_1, h_2, ..., h_T> we obtain the context vector c_t = q({h_1, h_2, ..., h_T}). Here h_j ∈ R^n is the hidden state at time step j, and f and q are nonlinear activation functions: f is generally a GRU or LSTM, and q is generally an attention network.
In a classical neural machine translation system, the context vector is generally obtained with an attention network, computed as follows:

e_tj = a(s_{t-1}, h_j)
α_tj = exp(e_tj) / Σ_k exp(e_tk)
c_t = Σ_j α_tj h_j

where a is a one-layer feed-forward network and α_tj is the weight of encoder hidden state h_j.
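A minimal numeric sketch of the attention computation described above. For illustration the score function a is replaced by a plain dot product (the patent's a is a one-layer feed-forward network), and the vectors are toy values.

```python
# Attention, written out numerically: scores e_tj = a(s_{t-1}, h_j),
# weights α_tj = softmax over the scores, context c_t = Σ_j α_tj · h_j.

import math

def attention_context(s_prev, hidden_states):
    # e_tj: dot-product stand-in for the one-layer feed-forward network a
    scores = [sum(si * hi for si, hi in zip(s_prev, h)) for h in hidden_states]
    # α_tj = exp(e_tj) / Σ_k exp(e_tk), with max-subtraction for stability
    m = max(scores)
    exps = [math.exp(e - m) for e in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # c_t = Σ_j α_tj h_j (weighted sum of encoder hidden states)
    dim = len(hidden_states[0])
    c_t = [sum(a * h[d] for a, h in zip(alphas, hidden_states)) for d in range(dim)]
    return alphas, c_t

alphas, c_t = attention_context([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(alphas)  # weights sum to 1; the state most similar to s_prev weighs more
```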
The decoder is also composed of an RNN network. In the decoder stage, given the context vector c_t and all previously predicted words {y_1, y_2, ..., y_{t-1}}, the decoder continues to predict y_t. This is achieved step by step through the definition p(y) = Π_t p(y_t | {y_1, ..., y_{t-1}}, c_t), where y = <y_1, ..., y_T>. In addition, p(y_t | {y_1, ..., y_{t-1}}, c_t) = g(y_{t-1}, s_t, c_t), where g is a nonlinear activation function, generally softmax, and s_t is the RNN hidden-layer state, s_t = f(y_{t-1}, s_{t-1}, c_t).
The encoder and decoder both use RNN networks mainly because of the RNN's defining feature: its hidden state is jointly determined by the current input and the previous hidden state. In this neural machine translation process, the encoder-stage hidden state is jointly determined by the word vector of the current source-language word and the previous hidden state, and the decoder-stage hidden state is jointly determined by the target-language word vector computed in the previous step and the previous hidden state.
The model is generally trained by minimizing the negative log-likelihood as the loss function, iterating with stochastic gradient descent. On a training set D = {(x_n, y_n)}_{n=1}^N, where each (x_n, y_n) is a parallel sentence pair, the training objective is:

J(θ) = -(1/N) Σ_{n=1}^N log p(y_n | x_n; θ)
The specific application scenario of the present invention is described below:
Processing of the training-set corpus:
Referring to Fig. 1 and Fig. 2, we add the target-end pronoun at the position where the source sentence lacks a pronoun. Because the training-set corpus is parallel, we can use alignment information. GIZA++ is first used for word alignment to obtain the approximate positions of missing pronouns; then every candidate pronoun is inserted into every candidate missing position; then a language model selects the most suitable pronoun and position. After the best pronoun and position have been picked, GIZA++ word alignment is run again and the supplemented pronoun is replaced with the corresponding pronoun in the target sentence.
For example, if the source sentence is '吃了吗' ('Eaten?'), the sentence supplemented with the target-end pronoun is 'you 吃了吗'.
Processing of the test-set and development-set corpus:
Because the development and test sets are not parallel and have no target-end sentences, they cannot be processed with the training-set method. Instead, we treat the processing of the development and test sets as a part-of-speech-tagging problem. There are 32 label classes, one for each pronoun plus an empty class (meaning no pronoun is missing). Using the open-source Foolnltk toolkit, a tagging model is trained on the processed training-set corpus and then applied to the test and development sets; the processed examples look like those above.
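The conversion of supplemented training sentences into sequence-labeling examples can be sketched as follows. This is toolkit-agnostic data preparation, not the Foolnltk API, and the label scheme (a pronoun label on the token that follows the insertion point, 'O' for the empty class) is one plausible reading of the 32-class scheme described above.

```python
# Sketch: derive per-token labels by aligning the original (unsupplemented)
# token list against the supplemented one. Each original token is labeled
# with the pronoun inserted directly before it, or "O" (the empty class).

def to_labels(original, supplemented):
    """Walk both sequences in step; tokens present only in `supplemented`
    are insertions and become the label of the next original token."""
    labels, j = [], 0
    for tok in original:
        pending = "O"
        while supplemented[j] != tok:   # extra tokens are inserted pronouns
            pending = supplemented[j]
            j += 1
        labels.append(pending)
        j += 1
    return labels

# 'my' was inserted before '一辈子' in the supplemented sentence:
print(to_labels(["我", "用", "了", "一辈子"],
                ["我", "用", "了", "my", "一辈子"]))
```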
Referring to Fig. 3, we add a copy mechanism to the above NMT model: the training, test, and development sets are processed once more, with the labels <copy> and </copy> inserted before and after each supplemented pronoun in the source sentence. The same labels are also added around the corresponding pronoun in the target sentence, and the NMT system is trained on the processed corpus. The copied pronoun (for example "my") in the src and in the trg shares the same word embedding. The NMT system can learn the correspondence between <copy>...</copy> at the src end and <copy>...</copy> at the trg end, and the shared word embedding helps ensure the correctness of the generated translation.
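The shared word-embedding idea can be sketched as follows: a copied pronoun looks up the same row of one shared table whether it occurs on the src or the trg side, which is what ties the two <copy>...</copy> spans together. The vocabularies and vectors below are illustrative assumptions, not the system's actual parameters.

```python
# Sketch of src/trg weight tying for copied tokens: tokens in the shared
# vocabulary fall back to one common embedding table on both sides.

shared_vocab = {"<copy>": 0, "</copy>": 1, "my": 2}
embedding_table = [[0.1, 0.2], [0.3, 0.4], [0.9, -0.5]]  # one row per entry

def embed(token, side_vocab, side_table):
    """Shared tokens use the common table; others use their side's table."""
    if token in shared_vocab:
        return embedding_table[shared_vocab[token]]
    return side_table[side_vocab[token]]

src_vocab, src_table = {"我": 0}, [[0.7, 0.7]]
trg_vocab, trg_table = {"I": 0}, [[-0.2, 0.5]]

# 'my' resolves to the identical vector whether it appears in src or trg:
print(embed("my", src_vocab, src_table) is embed("my", trg_vocab, trg_table))
```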
The proposed method not only supplements the omitted pronouns but also avoids the lexical translation ambiguity introduced after missing pronouns are filled in, thereby effectively improving the translation quality of pronoun-dropping conversational language.
Through various experiments we found that translations with pronouns filled in clearly improve on translations without. Our method of filling in the target-end pronoun (+ProDrop_target) yields better translations than the previously proposed method of directly filling in source-language pronouns (+ProDrop), with a BLEU improvement of nearly 1 point, showing that our method can substantially improve the translation of pronoun-dropping spoken utterances. The experimental results are shown in Table 1 below:
Table 1
A specific example:
src: 我用了(我)一辈子 ('I have used (my) a lifetime')
trg: I spend my whole life.
In this example the pronoun '我' (here corresponding to 'my') is omitted in the original sentence. With our method, a word-alignment operation is performed first, with the effect shown in Fig. 4. Then, from the alignment of the surrounding words, we judge that the missing pronoun lies roughly between '用了' ('used') and '一辈子' ('a lifetime'). All 31 pronouns are inserted at these candidate positions and the best match is selected with the language model. Suppose we determine that inserting the pronoun '我' between '用了' and '一辈子' works best; we take this sentence as the final candidate and run word alignment again, with the effect shown in Fig. 5.
At this point it is clear that the originally missing pronoun positions are all aligned; therefore we only need to replace the supplemented Chinese pronoun with the English-end pronoun, obtaining:
src: 我用了 my 一辈子
trg: I spend my whole life.
In the final step, the labels <copy> and </copy> are inserted before and after the supplemented pronoun, obtaining:
src: 我用了 <copy>my</copy> 一辈子
trg: I spend <copy>my</copy> whole life.
The processed corpus is then fed to the attention-based encoder-decoder model for training and translation.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that those of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these belong to the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.
Claims (10)
1. A corpus processing method for handling pronouns omitted in a neural machine translation system, applied to attention-based NMT models using an encoder-decoder framework, characterized by comprising:
obtaining the original corpus;
performing word alignment on the obtained corpus to find the approximate positions of missing pronouns;
inserting every candidate pronoun into every candidate missing position;
selecting the most suitable pronoun and position with a language model;
performing word alignment again and replacing each supplemented pronoun with the corresponding pronoun from the target sentence;
training a sequence-labeling model on the supplemented training corpus;
labeling the development and test sets with the trained sequence-labeling model to supplement their pronouns.
2. The corpus processing method according to claim 1, characterized in that in the step "performing word alignment on the obtained corpus to find the approximate positions of missing pronouns", word alignment is carried out with the GIZA++ model.
3. The corpus processing method according to claim 1, characterized in that in the step "performing word alignment again and replacing each supplemented pronoun with the corresponding pronoun from the target sentence", word alignment is carried out with the GIZA++ model.
4. The corpus processing method according to claim 1, characterized in that the word-alignment methods used in the two word-alignment steps are the same.
5. The corpus processing method according to claim 1, characterized in that the word-alignment methods used in the two word-alignment steps are different.
6. A translation method using a neural machine translation system, applied to attention-based NMT models using an encoder-decoder framework, characterized by:
processing the original corpus with the corpus processing method according to any one of claims 1 to 5;
inserting a first label and a second label before and after each supplemented pronoun in the source sentence;
adding the same first and second labels around the corresponding pronoun in the target sentence;
training the NMT system on the corpus processed as above;
translating with the trained NMT system.
7. The translation method according to claim 6, characterized in that the first label is <copy> and the second label is </copy>.
8. The translation method according to claim 6, characterized in that the first label is </copy> and the second label is <copy>.
9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, characterized in that the processor, when executing the program, implements the steps of the method according to any one of claims 1-8.
10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the steps of the method according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810326895.6A CN108549644A (en) | 2018-04-12 | 2018-04-12 | Omission pronominal translation method towards neural machine translation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810326895.6A CN108549644A (en) | 2018-04-12 | 2018-04-12 | Omission pronominal translation method towards neural machine translation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108549644A true CN108549644A (en) | 2018-09-18 |
Family
ID=63514808
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810326895.6A Pending CN108549644A (en) | 2018-04-12 | 2018-04-12 | Omission pronominal translation method towards neural machine translation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108549644A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109948166A (en) * | 2019-03-25 | 2019-06-28 | 腾讯科技(深圳)有限公司 | Text interpretation method, device, storage medium and computer equipment |
CN110598222A (en) * | 2019-09-12 | 2019-12-20 | 北京金山数字娱乐科技有限公司 | Language processing method and device, and training method and device of language processing system |
WO2020197504A1 (en) * | 2019-03-28 | 2020-10-01 | Agency For Science, Technology And Research | A method for pre-processing a sequence of words for neural machine translation |
CN112257460A (en) * | 2020-09-25 | 2021-01-22 | 昆明理工大学 | Pivot-based Hanyue combined training neural machine translation method |
CN114595700A (en) * | 2021-12-20 | 2022-06-07 | 昆明理工大学 | Zero-pronoun and chapter information fused Hanyue neural machine translation method |
WO2022116841A1 (en) * | 2020-12-04 | 2022-06-09 | 北京有竹居网络技术有限公司 | Text translation method, apparatus and device, and storage medium |
US20220215177A1 (en) * | 2018-07-27 | 2022-07-07 | Beijing Jingdong Shangke Information Technology Co., Ltd. | Method and system for processing sentence, and electronic device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101187922A (en) * | 2006-11-17 | 2008-05-28 | 徐赞国 | Precision machine translation method and its device |
CN107092666A (en) * | 2010-12-30 | 2017-08-25 | 脸谱公司 | System, method and storage medium for network |
EP3210132A1 (en) * | 2014-10-24 | 2017-08-30 | Google, Inc. | Neural machine translation systems with rare word processing |
CN107423290A (en) * | 2017-04-19 | 2017-12-01 | 厦门大学 | A kind of neural network machine translation model based on hierarchical structure |
CN107590138A (en) * | 2017-08-18 | 2018-01-16 | 浙江大学 | A kind of neural machine translation method based on part of speech notice mechanism |
CN107632981A (en) * | 2017-09-06 | 2018-01-26 | 沈阳雅译网络技术有限公司 | A kind of neural machine translation method of introducing source language chunk information coding |
Application Events
- 2018-04-12: Application CN201810326895.6A filed in China; published as CN108549644A; status: Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101187922A (en) * | 2006-11-17 | 2008-05-28 | Xu Zanguo | Precision machine translation method and its device |
CN107092666A (en) * | 2010-12-30 | 2017-08-25 | Facebook, Inc. | System, method and storage medium for network |
EP3210132A1 (en) * | 2014-10-24 | 2017-08-30 | Google, Inc. | Neural machine translation systems with rare word processing |
CN107423290A (en) * | 2017-04-19 | 2017-12-01 | Xiamen University | Neural network machine translation model based on hierarchical structure |
CN107590138A (en) * | 2017-08-18 | 2018-01-16 | Zhejiang University | Neural machine translation method based on a part-of-speech attention mechanism |
CN107632981A (en) * | 2017-09-06 | 2018-01-26 | Shenyang YaTrans Network Technology Co., Ltd. | Neural machine translation method incorporating source-language chunk information encoding |
Non-Patent Citations (2)
Title |
---|
Wang Longyue et al.: "Translating Pro-Drop Languages with Reconstruction Models", Proceedings of the AAAI Conference on Artificial Intelligence * |
Xiong Deyi et al.: "A Survey on Computational Semantic Compositionality", Journal of Chinese Information Processing * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220215177A1 (en) * | 2018-07-27 | 2022-07-07 | Beijing Jingdong Shangke Information Technology Co., Ltd. | Method and system for processing sentence, and electronic device |
CN109948166A (en) * | 2019-03-25 | 2019-06-28 | Tencent Technology (Shenzhen) Co., Ltd. | Text translation method, apparatus, storage medium and computer device |
WO2020197504A1 (en) * | 2019-03-28 | 2020-10-01 | Agency For Science, Technology And Research | A method for pre-processing a sequence of words for neural machine translation |
CN110598222A (en) * | 2019-09-12 | 2019-12-20 | Beijing Kingsoft Digital Entertainment Technology Co., Ltd. | Language processing method and device, and training method and device of language processing system |
CN110598222B (en) * | 2019-09-12 | 2023-05-30 | Beijing Kingsoft Digital Entertainment Technology Co., Ltd. | Language processing method and device, and training method and device of language processing system |
CN112257460A (en) * | 2020-09-25 | 2021-01-22 | Kunming University of Science and Technology | Pivot-based Chinese-Vietnamese joint-training neural machine translation method |
CN112257460B (en) * | 2020-09-25 | 2022-06-21 | Kunming University of Science and Technology | Pivot-based Chinese-Vietnamese joint-training neural machine translation method |
WO2022116841A1 (en) * | 2020-12-04 | 2022-06-09 | Beijing Youzhuju Network Technology Co., Ltd. | Text translation method, apparatus and device, and storage medium |
CN114595700A (en) * | 2021-12-20 | 2022-06-07 | Kunming University of Science and Technology | Chinese-Vietnamese neural machine translation method fusing zero pronouns and discourse information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108549644A (en) | Dropped-pronoun translation method for neural machine translation | |
CN110334361B (en) | Neural machine translation method for the Chinese language | |
CN110678881B (en) | Natural language processing using context-specific word vectors | |
CN109117483B (en) | Training method and device of neural network machine translation model | |
Yi et al. | Automatic poetry generation with mutual reinforcement learning | |
Zhang et al. | Lattice transformer for speech translation | |
CN108733837B (en) | Natural language structuring method and device for medical history text | |
CN110134968B (en) | Poem generation method, device, equipment and storage medium based on deep learning | |
CN113158665B (en) | Method for improving dialog text generation based on text abstract generation and bidirectional corpus generation | |
CN106663092A (en) | Neural machine translation systems with rare word processing | |
CN108132932B (en) | Neural machine translation method with copy mechanism | |
US11574142B2 (en) | Semantic image manipulation using visual-semantic joint embeddings | |
CN108563622B (en) | Method and device for generating jueju (Chinese quatrains) with style diversity | |
Liu et al. | Qaner: Prompting question answering models for few-shot named entity recognition | |
Yoon et al. | Efficient transfer learning schemes for personalized language modeling using recurrent neural network | |
CN108460028A (en) | Domain adaptation method for neural machine translation incorporating sentence weights | |
CN107657313B (en) | System and method for transfer learning of natural language processing task based on field adaptation | |
CN113096242A (en) | Virtual anchor generation method and device, electronic equipment and storage medium | |
KR20200116760A (en) | Methods and apparatuses for embedding word considering contextual and morphosyntactic information | |
CN104933038A (en) | Machine translation method and machine translation device | |
CN115906815A (en) | Error correction method and device for modifying one or more types of wrong sentences | |
CN109359308A (en) | Machine translation method, device and readable storage medium | |
Mandal et al. | Futurity of translation algorithms for neural machine translation (NMT) and its vision | |
CN116136870A (en) | Intelligent social conversation method and conversation system based on enhanced entity representation | |
CN114282555A (en) | Translation model training method and device, and translation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2018-09-18