CN107273363A - Language text translation method and system - Google Patents

Language text translation method and system

Info

Publication number
CN107273363A
Authority
CN
China
Prior art keywords
translation
probability distribution
text
probability
language text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710335652.4A
Other languages
Chinese (zh)
Other versions
CN107273363B (en
Inventor
刘洋
张嘉成
孙茂松
栾焕博
许静芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beijing Sogou Technology Development Co Ltd
Original Assignee
Tsinghua University
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University and Beijing Sogou Technology Development Co Ltd
Priority to CN201710335652.4A
Publication of CN107273363A
Application granted
Publication of CN107273363B
Legal status: Active
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/44 Statistical methods, e.g. probability models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The present invention provides a language text translation method and system. The method includes: determining, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated; determining a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model; and determining the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution. The invention can incorporate arbitrary prior knowledge into the translation model, thereby improving the accuracy and reliability of machine translation.

Description

Language text translation method and system
Technical field
The present invention relates to the field of machine translation technology, and in particular to a language text translation method and system.
Background
As internationalization advances, communication between speakers of different languages keeps growing, and translation has become a vital tool in that communication. Machine translation, being convenient, simple, and free, largely meets people's translation needs and improves the efficiency of international exchange, which in turn leads people to demand higher accuracy from machine translation.
Machine translation can be roughly divided into rule-based machine translation and corpus-based machine translation. The key problem in corpus-based machine translation is building a complete corpus, which can also be described as a set of high-quality training samples. The quality of the training samples directly affects translation accuracy. However, building high-quality training samples is not easy: sample data is limited and cannot characterize the distribution of the original data well, and even with enough sample data, erroneous samples (noisy data) cannot be avoided. A neural network trained on such samples can hardly reflect the original model accurately and may even violate prior knowledge. In this situation, introducing prior knowledge becomes very important. A translation rule such as "do not translate repeatedly and do not omit translations" is an example of prior knowledge. Many studies have shown that incorporating prior knowledge into a neural network model to constrain it can improve the network's performance.
Attention-based neural machine translation (Attention-based NMT) is a branch of corpus-based machine translation and the method used in current mainstream translation systems. Its basic idea is to use an end-to-end nonlinear neural network to map source language text directly to target language text, that is, to build an "encoder-decoder" framework: given a source language sentence, an encoder first maps it to a continuous, dense vector, and a decoder then converts that vector into a target language sentence. However, it is difficult to incorporate prior knowledge into the neural network with this method.
Some existing techniques do incorporate prior knowledge into neural networks. For example, some represent prior knowledge with additional neural network modules, and some add constraint terms to the training objective. Although these techniques can significantly improve translation quality, the former requires the correlations between different pieces of prior knowledge to be modeled as well, and the latter can only add a small number of simple constraint terms. These problems make such techniques unsuitable for incorporating arbitrary, complex prior knowledge into a neural machine translation model.
Therefore, how to provide a translation method that can incorporate arbitrary prior knowledge into a neural machine translation model is a problem in urgent need of a solution.
Summary of the invention
To solve the problem in the prior art that arbitrary prior knowledge cannot be incorporated into a neural network translation model, the present invention provides a language text translation method and system.
In one aspect, the present invention provides a language text translation method, which includes:
determining, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated;
determining a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model; and
determining the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution.
In another aspect, the present invention provides a language text translation system, which includes:
a translation candidate set module, configured to determine, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated;
a training module, configured to determine a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model; and
a translation module, configured to determine the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution.
The language text translation method and system provided by the present invention calculate the probability distributions of the prior knowledge model and of the translation model over the translation candidate set, and use the difference between the two probability distributions as part of the training objective, so that the machine translation model can learn arbitrary prior knowledge, which improves the accuracy and reliability of machine translation results.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the language text translation method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the language text translation system provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a schematic flowchart of the language text translation method provided by an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step 101: determine, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated;
Step 102: determine a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model;
Step 103: determine the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution.
Specifically, first, the preset translation candidate set determination rule reflects that translation is a sequence generation task: the source language text x contains multiple characters or words, and when the translation candidate set is generated, each previously generated character or word serves as input for generating the next one. The size of the true translation candidate set grows exponentially with the length of the source language text x, so it cannot be computed efficiently. In practice, multiple translated texts of the source language text, that is, the translation candidate set S(x), are obtained by random sampling or beam search. This can be done with existing techniques and is not described further here.
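The beam search mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `TOY_MODEL`, `next_token_probs`, and `beam_search` are hypothetical names, and the toy next-token table stands in for the decoder of a real translation model.

```python
import math

# Hypothetical toy next-token model: maps a prefix (tuple of tokens) to
# next-token probabilities. A real system would query the decoder of the
# translation model here instead.
TOY_MODEL = {
    (): {"many": 0.6, "several": 0.4},
    ("many",): {"airports": 0.9, "</s>": 0.1},
    ("several",): {"airports": 0.8, "</s>": 0.2},
    ("many", "airports"): {"closed": 0.5, "</s>": 0.5},
    ("several", "airports"): {"closed": 0.7, "</s>": 0.3},
}


def next_token_probs(prefix):
    # Prefixes missing from the toy table simply end the sentence.
    return TOY_MODEL.get(tuple(prefix), {"</s>": 1.0})


def beam_search(beam_size, max_len=10):
    """Return up to beam_size finished hypotheses as (tokens, log_prob)."""
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        expanded = []
        for tokens, score in beams:
            for tok, p in next_token_probs(tokens).items():
                cand = (tokens + (tok,), score + math.log(p))
                # Hypotheses ending in </s> are complete; others stay open.
                (finished if tok == "</s>" else expanded).append(cand)
        if not expanded:
            break
        # Keep only the beam_size best open hypotheses for the next step.
        expanded.sort(key=lambda c: c[1], reverse=True)
        beams = expanded[:beam_size]
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished[:beam_size]


# The returned hypotheses play the role of the candidate set S(x).
candidates = beam_search(beam_size=3)
```

Each generated token extends the prefix that conditions the next step, which is the sense in which a previously generated word serves as input for the next one.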
Then, the first probability distribution is determined according to the translation candidate set S(x) and the preset prior knowledge model Q(y|x;γ), and the second probability distribution is determined according to the translation candidate set S(x) and the preset translation model P(y|x;θ). Finally, based on the first probability distribution and the second probability distribution, the translated text y of the source language text is determined from the translation candidate set.
For clarity, let the source language text x be the input and the translated text y the output, so that they form a sentence pair (x, y). In practice, the same character or word has different meanings in different contexts, and the source language text x consists of multiple characters or words arranged in a particular order. The ambiguity of characters or words and the uncertainty of their order mean that one source language text may correspond to multiple translated texts (y1, y2, y3, and so on). The translated text with the highest probability among them is the optimal one; to distinguish it from the other translated texts, it is called the target language text.
For example, the preset prior knowledge model Q(y|x;γ) can take different forms depending on the feature function φ(x, y), and the first probability distribution can be determined by the following formula:
$$Q(y \mid x; \gamma) = \frac{\exp\big(\gamma \cdot \phi(x, y)\big)}{\sum_{y' \in S(x)} \exp\big(\gamma \cdot \phi(x, y')\big)}$$
where x is the source language text, y is the target language text, y' is a translated text in the translation candidate set S(x), and γ is the preset parameter of the prior knowledge model.
The feature function φ(x, y) represents the correspondence between the source language text and a translated text in a prior knowledge database. Given a specific feature function, the prior knowledge model scores each translated text y1, y2, and y3, that is, it calculates the probability that each translated text conforms to the prior knowledge model. The better a translated text conforms to the prior knowledge model, the higher its probability.
The translation model P(y|x;θ) is the scoring model commonly used in machine translation. It can be obtained by training on a parallel corpus, represents the correspondence between source language text x and translated text y in that corpus, and is used to calculate the probability that each translated text conforms to the translation model. It belongs to the prior art and is not described further here.
According to the translation candidate set S(x) and the translation model P(y|x;θ), the second probability distribution can be determined by the following formula:
where x is the source language text, y is the target language text, y' is a translated text, θ is the parameter of the translation model, and α is a preset hyperparameter that controls how sharply peaked the second probability distribution is.
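One way a hyperparameter such as α can control the steepness of a distribution over the candidate set is to raise each candidate's model probability to the power α and renormalize over S(x). The Python sketch below assumes that form; the assumption, the function name `sharpen`, and the example probabilities are illustrative and are not taken from the text.

```python
def sharpen(probs, alpha):
    """Raise each candidate probability to the power alpha, then
    renormalize so the values sum to 1 over the candidate set.

    ASSUMPTION: this power-and-renormalize form is one plausible reading
    of "alpha controls the steepness"; it is not confirmed by the text.
    """
    powered = [p ** alpha for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


# Hypothetical translation-model probabilities for three candidates.
model_probs = [0.4, 0.5, 0.1]

flat = sharpen(model_probs, alpha=0.5)    # alpha < 1 flattens
same = sharpen(model_probs, alpha=1.0)    # alpha = 1 only renormalizes
peaked = sharpen(model_probs, alpha=2.0)  # alpha > 1 sharpens
```

Under this assumed form, values of α above 1 concentrate probability on the top candidate and values below 1 spread it more evenly.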
The language text translation method provided by the embodiment of the present invention uses both the prior knowledge model and the translation model to score the multiple translated texts from two perspectives, which encourages translated texts that better conform to the prior knowledge model to also have higher probability under the translation model. The target language text is then determined from the translation candidate set, improving the performance of the translation model and the accuracy of the translation result.
On the basis of the above embodiment, determining the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution includes:
determining a probability difference parameter value based on the first probability distribution and the second probability distribution, where the probability difference parameter indicates the difference between the first probability distribution and the second probability distribution; and
determining the translated text of the source language text from the translation candidate set based on the probability difference parameter value.
Specifically, first, the translation candidate set S(x) corresponding to the source language text x is determined according to the preset translation candidate set determination rule. Then, the first probability distribution and the second probability distribution are determined based on the translation candidate set, the translation model, and the prior knowledge model. Next, the probability difference parameter value between the first probability distribution and the second probability distribution is determined. Finally, based on the probability difference parameter value, the translated text y of the source language text x is determined from the translation candidate set S(x).
For example, after logging in to the translation system, a user enters the source language text x (a Chinese sentence meaning "many airports were forced to close") in the input box of the Chinese-to-English translation window. The translation candidate set S(x) determined from x contains two translated texts: y1 is "Many Airports were closed to close" and y2 is "Many airports were forced to close down";
According to the prior knowledge model, the first probability distribution is determined,
where Q(y1|x) = 0.2, that is, the probability that the sentence pair (x, y1) conforms to the prior knowledge model is 0.2, and Q(y2|x) = 0.8, that is, the probability that the sentence pair (x, y2) conforms to the prior knowledge model is 0.8;
According to the translation model, the second probability distribution is determined,
where P(y1|x) = 0.6, that is, the probability that the sentence pair (x, y1) conforms to the translation model is 0.6, and P(y2|x) = 0.4, that is, the probability that the sentence pair (x, y2) conforms to the translation model is 0.4;
From the first probability distribution and the second probability distribution, the difference parameter value between them can be determined. Based on this difference parameter value, the translation model is adjusted and the two translated texts are scored again, giving P(y1|x) = 0.3 and P(y2|x) = 0.7;
Accordingly, the translated text y determined for the source language text x is "Many airports were forced to close down".
As the above embodiment shows, the language text translation method provided by the embodiment of the present invention uses the difference parameter value between the first probability distribution and the second probability distribution and rescores the multiple translated texts with the translation model. This raises the probability of translated texts that conform to prior knowledge in the translation model's probability distribution and yields a more accurate translated text for the source language text.
On the basis of the above embodiment, the difference parameter value between the first probability distribution and the second probability distribution is the KL (Kullback-Leibler) distance, which can be determined by the following formula:
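As a rough illustration, the KL distance between two discrete distributions over the same candidate set can be computed as in the Python sketch below. It uses the two-candidate values from the airport example (Q = 0.2, 0.8 and P = 0.6, 0.4). The KL distance is not symmetric, and which direction the method uses is treated here as an open assumption, so both directions are shown; the function name `kl_divergence` is hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions
    defined over the same candidates. Terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Values from the two-candidate airport example.
prior = [0.2, 0.8]  # first probability distribution (prior knowledge model)
model = [0.6, 0.4]  # second probability distribution (translation model)

# ASSUMPTION: the direction of the divergence is not fixed by this sketch.
kl_prior_model = kl_divergence(prior, model)
kl_model_prior = kl_divergence(model, prior)
```

Either value is zero only when the two distributions agree, so minimizing it pushes the translation model's distribution toward the prior knowledge model's.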
On the basis of the above embodiments, determining the translated text of the source language text from the translation candidate set based on the probability difference parameter value includes:
determining a training objective based on the difference parameter value, where the training objective directs the translation model to approach the prior knowledge model; and
determining the translated text of the source language text from the translation candidate set based on the training objective and a preset reranking model.
Specifically, first, the translation candidate set S(x) corresponding to the source language text x is determined according to the preset translation candidate set determination rule. Then, the first probability distribution and the second probability distribution are determined based on the translation candidate set, the translation model, and the prior knowledge model. Next, the probability difference parameter value between the first probability distribution and the second probability distribution is determined. After that, the training objective J(θ, γ) is determined based on the probability difference parameter value, so that the translation model approaches the prior knowledge model. Finally, based on the training objective J(θ, γ) and the preset reranking model, the translated text y of the source language text x is determined from the translation candidate set S(x).
In general, when translated texts are scored, the log-likelihood of the translation model P(y|x;θ) is used as the standard training criterion, that is, the traditional training objective is the log-likelihood function L(θ) = log P(y|x;θ).
Once the difference parameter value between the first probability distribution and the second probability distribution has been determined, it is added to the traditional training objective to give the new training objective J(θ, γ). Under this objective, the optimal parameters θ and γ encourage the translated text that best conforms to prior knowledge to have the highest probability in the translation model's second probability distribution, so that the translation model is more inclined to select a translated text that conforms to prior knowledge from the translation candidate set S(x) as the target language text y of the source language text x.
Optionally, if the difference parameter value is the KL distance, the training objective can be determined by the following formula:
where λ1 and λ2 are preset hyperparameters that balance the terms of the training objective, and N is the number of sentence pairs in the training data.
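A training objective of this kind can be sketched in Python as a weighted log-likelihood term minus a weighted KL term summed over the N sentence pairs. The exact form, the sign convention (the objective is maximized), and the direction of the KL term are assumptions of this sketch rather than statements of the patented formula; `training_objective` and the example numbers are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions over the same candidates."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def training_objective(log_likelihoods, prior_dists, model_dists,
                       lambda1=1.0, lambda2=1.0):
    """ASSUMED form: J = lambda1 * sum_n log P(y_n | x_n)
                         - lambda2 * sum_n KL(prior_n || model_n).

    log_likelihoods: one log P(y|x) value per training sentence pair.
    prior_dists, model_dists: one distribution over S(x) per sentence pair.
    """
    likelihood_term = sum(log_likelihoods)
    kl_term = sum(kl_divergence(q, p)
                  for q, p in zip(prior_dists, model_dists))
    return lambda1 * likelihood_term - lambda2 * kl_term


# One hypothetical training pair (N = 1) with two candidates.
log_likelihoods = [math.log(0.5)]
prior = [[0.8, 0.2]]
j_far = training_objective(log_likelihoods, prior, [[0.3, 0.7]])
j_near = training_objective(log_likelihoods, prior, [[0.7, 0.3]])
```

With the likelihood held fixed, the objective is larger when the model's distribution over the candidates is closer to the prior knowledge model's, which is the behavior the text attributes to the new training objective.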
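After the optimal parameters θ and γ are obtained from the new training objective, the following reranking model is used to determine the translated text of the source language text from the translation candidate set:
$$\hat{y} = \operatorname*{argmax}_{y \in S(x)} \big\{ \log P(y \mid x; \theta) + \gamma \cdot \phi(x, y) \big\}$$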
For example, suppose the source language text x is a sentence meaning "Bush held talks with Sharon", and the translation candidate set S(x) determined from x contains three translated texts: y1 is "Bush held a talk with Sharon", y2 is "Bush held a talk with Bush", and y3 is "Bush had lunch with Sharon".
Suppose the feature function φ(x, y) is the number of word pairs whose source word appears in the source language text x and whose target word appears in the target language text y of the sentence pair, and the word pair set is {(Bush, Bush), (hold, held), (talks, talk), (Sharon, Sharon)}, where the first element of each pair stands for a source-language word. In the first translated text y1, all 4 word pairs appear, so φ(x, y1) = 4. Similarly, φ(x, y2) = 3 and φ(x, y3) = 2.
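The word-pair feature function described above can be sketched in Python as follows. The `src_` tokens are hypothetical placeholders for the source-language words, which are not shown in their original form here; `word_pair_feature` is likewise an illustrative name.

```python
def word_pair_feature(source_tokens, target_tokens, word_pairs):
    """Count the word pairs (s, t) whose source word s occurs in the
    source text and whose target word t occurs in the translated text."""
    src, tgt = set(source_tokens), set(target_tokens)
    return sum(1 for s, t in word_pairs if s in src and t in tgt)


# Hypothetical placeholders for the four source-language words.
WORD_PAIRS = [
    ("src_Bush", "Bush"),
    ("src_hold", "held"),
    ("src_talks", "talk"),
    ("src_Sharon", "Sharon"),
]
x = ["src_Bush", "src_Sharon", "src_hold", "src_talks"]

candidates = {
    "y1": "Bush held a talk with Sharon".split(),
    "y2": "Bush held a talk with Bush".split(),
    "y3": "Bush had lunch with Sharon".split(),
}

features = {name: word_pair_feature(x, tokens, WORD_PAIRS)
            for name, tokens in candidates.items()}
```

The resulting counts match the values given in the text: 4 for y1, 3 for y2, and 2 for y3.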
The first probability distribution can be determined according to the prior knowledge model,
where the probability of translated text y1 is Q(y1|x) = e^4/(e^2 + e^3 + e^4).
Similarly, Q(y2|x) = e^3/(e^2 + e^3 + e^4) and Q(y3|x) = e^2/(e^2 + e^3 + e^4). The final values are Q(y1|x) = 0.67, Q(y2|x) = 0.24, and Q(y3|x) = 0.09.
These probabilities show that translated text y1 best conforms to the prior knowledge model, and it is in fact the correct translation. Translated text y2 clearly violates the prior knowledge "do not translate repeatedly and do not omit translations", so its probability is lower. Translated text y3 deviates from the meaning of the source language text, so its probability is lower still.
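The three probabilities can be checked with a short Python sketch that exponentiates γ·φ(x, y) for each candidate and normalizes over the candidate set. The value γ = 1 is an assumption inferred from the worked numbers, and `prior_distribution` is a hypothetical name.

```python
import math


def prior_distribution(features, gamma=1.0):
    """Normalize exp(gamma * phi) over the candidate set.

    The maximum score is subtracted before exponentiating, which leaves
    the result unchanged but avoids overflow for large feature values.
    """
    scores = [gamma * f for f in features]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Feature values phi(x, y1), phi(x, y2), phi(x, y3) from the example.
q = prior_distribution([4, 3, 2])  # gamma = 1 is an assumption
rounded = [round(v, 2) for v in q]
```

Rounded to two decimals, the sketch gives 0.67, 0.24, and 0.09, in agreement with the values stated above.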
Suppose the second probability distribution is obtained from the translation model before adjustment,
where P(y1|x) = 0.4, P(y2|x) = 0.5, and P(y3|x) = 0.1, so the translation model would output "Bush held a talk with Bush".
Now, if the preset hyperparameters λ1 and λ2 are both set to 1, the KL distance KL(P||Q) between the two probability distributions is calculated by the formula, and the new training objective J(θ, γ) is determined based on the KL distance;
Based on the training objective and the reranking model, the translation model is adjusted. After training, P(y1|x) = 0.6, P(y2|x) = 0.31, and P(y3|x) = 0.09. The new training objective thus raises the probability of translated text y1 and lowers the probabilities of translated texts y2 and y3, so that translated texts that better conform to prior knowledge have higher probability in the translation model's probability distribution; that is, the translation model approaches the prior knowledge model.
Therefore, the target language text y finally output is "Bush held a talk with Sharon".
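The final selection step can be sketched in Python by scoring each candidate with log P(y|x;θ) + γ·φ(x, y) and taking the best one. The probabilities are the post-training values from the example and the feature values are those computed earlier; γ = 1 and the names `rerank`, `prob`, and `phi` are assumptions of this sketch.

```python
import math


def rerank(candidates, gamma=1.0):
    """Return the candidate with the highest log P(y|x) + gamma * phi(x, y)."""
    def score(c):
        return math.log(c["prob"]) + gamma * c["phi"]
    return max(candidates, key=score)


# Post-training translation-model probabilities and feature values
# from the example; gamma = 1 is an assumption.
candidates = [
    {"text": "Bush held a talk with Sharon", "prob": 0.6, "phi": 4},
    {"text": "Bush held a talk with Bush", "prob": 0.31, "phi": 3},
    {"text": "Bush had lunch with Sharon", "prob": 0.09, "phi": 2},
]

best = rerank(candidates)
```

Because the score combines the translation model's log-probability with the prior knowledge feature, a candidate has to do well on both to be selected.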
As the above embodiment shows, the language text translation method provided by the embodiment of the present invention adds the KL distance between the probability distribution under the prior knowledge model and the probability distribution under the translation model to the traditional training objective. This encourages translated texts that better conform to the prior knowledge model to also have higher probability under the translation model, which yields better translation model parameters. The target language text is then determined from the translation candidate set, improving the performance of the translation model and the accuracy of the translation result.
Fig. 2 is a schematic structural diagram of the language text translation system provided by an embodiment of the present invention. As shown in Fig. 2, the system includes a translation candidate set module 21, a training module 22, and a translation module 23. The translation candidate set module 21 is configured to determine, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated. The training module 22 is configured to determine a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model. The translation module 23 is configured to determine the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution.
It should be noted that the language text translation system is intended to implement the above method embodiments. For its specific functions, refer to the above method embodiments, which are not repeated here.
On the basis of the above embodiment, the translation module 23 in the system is specifically configured to: determine a probability difference parameter value based on the first probability distribution and the second probability distribution, where the probability difference parameter indicates the difference between the first probability distribution and the second probability distribution; and determine the translated text of the source language text from the translation candidate set based on the probability difference parameter value. Optionally, the probability difference parameter is the KL distance.
On the basis of the above embodiments, the translation module 23 in the system is specifically configured to: determine a training objective based on the difference parameter value, where the training objective directs the translation model to approach the prior knowledge model; and determine the translated text of the source language text from the translation candidate set based on the training objective and a preset reranking model.
With the language text translation method and system provided by the present invention, prior knowledge is incorporated into the translation model during the training stage, which improves the performance of the translation model and carries the prior knowledge into the translation process. Arbitrary prior knowledge can thus be applied to machine translation without adding extra network modules, ultimately improving the accuracy and reliability of translation results.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. a kind of language text interpretation method, it is characterised in that including:
Rule is determined according to default translation candidate collection, the corresponding translation candidate collection of source language text, the translation is determined Candidate collection includes multiple cypher texts of source language text;The source language text is language text to be translated;
Based on the translation candidate collection, default translation model and default priori model, the first probability distribution is determined And second probability distribution;First probability distribution is used to indicate that the cypher text meets the probability of priori model, institute Stating the second probability distribution is used to indicate that the cypher text meets the probability of translation model;
Based on first probability distribution and second probability distribution, the original language is determined from the translation candidate collection The cypher text of text.
2. according to the method described in claim 1, it is characterised in that described based on first probability distribution and described second general Rate is distributed, and the cypher text of the source language text is determined from the translation candidate collection, including:
Based on first probability distribution and second probability distribution, probability difference parameter value is determined;The probability difference ginseng Number is used for the difference for indicating first probability distribution and second probability distribution;
Based on the probability difference parameter value, the cypher text of the source language text is determined from the translation candidate collection.
3. method according to claim 2, it is characterised in that the probability difference parameter is KL distances.
4. The method according to claim 2, characterized in that determining the translated text of the source language text from the translation candidate set based on the probability difference parameter value comprises:
Determining a training objective based on the difference parameter value, where the training objective directs the translation model to approximate the prior knowledge model;
Determining the translated text of the source language text from the translation candidate set based on the training objective and a preset reranking model.
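One way to read a training objective that pushes the translation model toward the prior knowledge model is as a likelihood term penalized by the KL distance. The form below is a sketch under assumptions, not the formula given by the claims: $S(x)$ denotes the translation candidate set of source text $x$, $Q$ the first (prior knowledge) distribution, $P$ the second (translation model) distribution, $\mathcal{L}$ the translation model's ordinary training likelihood, and $\lambda \ge 0$ a hypothetical weight. The direction of the divergence is also assumed.

```latex
\mathrm{KL}\bigl(Q \,\|\, P\bigr)
  = \sum_{y \in S(x)} Q(y \mid x)\,\log\frac{Q(y \mid x)}{P(y \mid x)},
\qquad
J = \mathcal{L} - \lambda\,\mathrm{KL}\bigl(Q \,\|\, P\bigr)
```

Maximizing $J$ rewards fitting the training data while shrinking the gap between the two distributions, so a larger $\lambda$ pulls the translation model more strongly toward the prior knowledge model.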
5. A language text translation system, characterized by comprising:
A translation candidate set module, configured to determine, according to a preset rule for determining translation candidate sets, a translation candidate set corresponding to a source language text, where the translation candidate set comprises multiple translated texts of the source language text, and the source language text is the language text to be translated;
A training module, configured to determine a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that a translated text conforms to the prior knowledge model, and the second probability distribution indicates the probability that the translated text conforms to the translation model;
A translation module, configured to determine the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution.
6. The system according to claim 5, characterized in that the translation module is specifically configured to:
Determine a probability difference parameter value based on the first probability distribution and the second probability distribution, where the probability difference parameter indicates the difference between the first probability distribution and the second probability distribution;
Determine the translated text of the source language text from the translation candidate set based on the probability difference parameter value.
7. The system according to claim 6, characterized in that the probability difference parameter is the KL distance.
8. The system according to claim 6, characterized in that the translation module is specifically configured to:
Determine a training objective based on the difference parameter value, where the training objective directs the translation model to approximate the prior knowledge model;
Determine the translated text of the source language text from the translation candidate set based on the training objective and a preset reranking model.
CN201710335652.4A 2017-05-12 2017-05-12 Language text translation method and system Active CN107273363B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710335652.4A CN107273363B (en) 2017-05-12 2017-05-12 Language text translation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710335652.4A CN107273363B (en) 2017-05-12 2017-05-12 Language text translation method and system

Publications (2)

Publication Number Publication Date
CN107273363A (en) 2017-10-20
CN107273363B (en) 2019-11-22

Family

ID=60074224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710335652.4A Active CN107273363B (en) Language text translation method and system 2017-05-12 2017-05-12

Country Status (1)

Country Link
CN (1) CN107273363B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783824A (en) * 2018-12-17 2019-05-21 北京百度网讯科技有限公司 Interpretation method, device and storage medium based on translation model
CN110298045A (en) * 2019-05-31 2019-10-01 北京百度网讯科技有限公司 Machine translation method, device, equipment and storage medium
CN110334359A (en) * 2019-06-05 2019-10-15 华为技术有限公司 Text interpretation method and device
CN111178085A (en) * 2019-12-12 2020-05-19 科大讯飞(苏州)科技有限公司 Text translator training method, and professional field text semantic parsing method and device
CN111368091A (en) * 2020-02-13 2020-07-03 中国工商银行股份有限公司 Document translation method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646019A (en) * 2013-12-31 2014-03-19 哈尔滨理工大学 Method and device for fusing multiple machine translation systems
CN103678285A (en) * 2012-08-31 2014-03-26 富士通株式会社 Machine translation method and machine translation system
US20150248400A1 (en) * 2014-02-28 2015-09-03 Ebay Inc. Automatic extraction of multilingual dictionary items from non-parallel, multilingual, semi-strucutred data
CN105573994A (en) * 2016-01-26 2016-05-11 沈阳雅译网络技术有限公司 Statistic machine translation system based on syntax framework

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678285A (en) * 2012-08-31 2014-03-26 富士通株式会社 Machine translation method and machine translation system
CN103646019A (en) * 2013-12-31 2014-03-19 哈尔滨理工大学 Method and device for fusing multiple machine translation systems
US20150248400A1 (en) * 2014-02-28 2015-09-03 Ebay Inc. Automatic extraction of multilingual dictionary items from non-parallel, multilingual, semi-strucutred data
CN105573994A (en) * 2016-01-26 2016-05-11 沈阳雅译网络技术有限公司 Statistic machine translation system based on syntax framework

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHEN SHI等: "Knowledge-Based Semantic Embedding for Machine Translation", 《PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS》 *
郭俊博 等: "N-Best句法知识增强的统计机器翻译预调序模型", 《计算机工程与应用》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783824A (en) * 2018-12-17 2019-05-21 北京百度网讯科技有限公司 Interpretation method, device and storage medium based on translation model
CN109783824B (en) * 2018-12-17 2023-04-18 北京百度网讯科技有限公司 Translation method, device and storage medium based on translation model
CN110298045A (en) * 2019-05-31 2019-10-01 北京百度网讯科技有限公司 Machine translation method, device, equipment and storage medium
CN110298045B (en) * 2019-05-31 2023-03-24 北京百度网讯科技有限公司 Machine translation method, device, equipment and storage medium
CN110334359A (en) * 2019-06-05 2019-10-15 华为技术有限公司 Text interpretation method and device
CN111178085A (en) * 2019-12-12 2020-05-19 科大讯飞(苏州)科技有限公司 Text translator training method, and professional field text semantic parsing method and device
CN111368091A (en) * 2020-02-13 2020-07-03 中国工商银行股份有限公司 Document translation method and device
CN111368091B (en) * 2020-02-13 2023-09-22 中国工商银行股份有限公司 Document translation method and device

Also Published As

Publication number Publication date
CN107273363B (en) 2019-11-22

Similar Documents

Publication Publication Date Title
US11481562B2 (en) Method and apparatus for evaluating translation quality
CN107273363B (en) Language text translation method and system
US20200410396A1 (en) Implicit bridging of machine learning tasks
CN107967262B (en) Neural network Mongolian-Chinese machine translation method
Dowling et al. SMT versus NMT: Preliminary comparisons for Irish
CN109325229B (en) Method for calculating text similarity by utilizing semantic information
CN102789451A (en) Individualized machine translation system, method and translation model training method
Li et al. Improving text normalization using character-blocks based models and system combination
Li et al. Chinese grammatical error correction based on convolutional sequence to sequence model
CN113901208A (en) Method for analyzing sentiment tendency of Chinese-Vietnamese cross-lingual comments incorporating topic features
Lohar et al. A systematic comparison between SMT and NMT on translating user-generated content
WO2019218809A1 (en) Chapter-level text translation method and device
Tran et al. Hierarchical transformer encoders for Vietnamese spelling correction
CN109446535A (en) Mongolian-Chinese neural machine translation method based on a triangular architecture
Iranzo-Sánchez et al. From simultaneous to streaming machine translation by leveraging streaming history
JP2023002730A (en) Text error correction and text error correction model generating method, device, equipment, and medium
Liu et al. A novel domain adaption approach for neural machine translation
WO2022227196A1 (en) Data analysis method and apparatus, computer device, and storage medium
CN114580446A (en) Neural machine translation method and device based on document context
Xia et al. Generating Questions Based on Semi-Automated and End-to-End Neural Network.
CN110147556B (en) Construction method of multidirectional neural network translation system
Azpiazu et al. A framework for hierarchical multilingual machine translation
Singvongsa et al. Lao-Thai machine translation using statistical model
CN117034968B (en) Neural machine translation method, device, electronic equipment and medium
Amin et al. Marathi-english code-mixed text generation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant