CN104199813B - Pseudo-feedback-based personalized machine translation system and method - Google Patents


Publication number
CN104199813B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410491100.9A
Other languages
Chinese (zh)
Other versions
CN104199813A (en)
Inventor
杨沐昀
朱俊国
赵铁军
李生
徐冰
曹海龙
朱聪慧
郑德权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University Of Technology High Tech Development Corp
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN201410491100.9A
Publication of CN104199813A
Application granted
Publication of CN104199813B

Landscapes

  • Machine Translation (AREA)

Abstract

The invention relates to a pseudo-feedback-based personalized machine translation system and method. Traditional machine translation methods cannot yield a high-quality personalized translation system and cannot meet users' varied translation demands. The system comprises a phrase table filter module, an input module, an initial translation module, a pseudo-feedback search module, a phrase table sorting module and a decoder module. The method includes: an input step, in which a user inputs a translation task S; an initial translation step, in which an initial machine translation result T' of the translation task is obtained with the initial translation module; a pseudo-feedback search step, in which the pseudo-feedback search module retrieves the initial translation results and standard translations R of similar translation instances; a phrase table sorting step, in which a trained universal post-editing model is turned into a personalized post-editing model and filtered to obtain an optimized personalized post-editing model; and a decoding step, in which the decoder module uses the optimized personalized post-editing model to decode the initial machine translation result T' of the translation task so as to obtain an optimal final translation result. The system and method are applicable to the field of machine translation.

Description

Personalized machine translation system and method based on pseudo feedback
Technical Field
The invention relates to a personalized machine translation system and a personalized machine translation method, and belongs to the field of machine translation.
Background
With the rapid development of machine translation technology in recent years, translation quality has improved greatly, and some general online translation services can help people overcome the language barrier to read and understand common cross-language texts. However, further improving the quality of machine translation has met significant difficulties. On the one hand, the main disadvantage of existing statistical machine translation technology is that personalized translation requires a large amount of user feedback information, on which statistical training and modeling are performed to obtain a personalized machine translation model. The user feedback information required for this training is difficult to obtain, and existing methods cannot use it effectively, so a high-quality personalized translation system cannot be obtained. Although user feedback information can be used through conventional post-editing, the advantages of a statistical post-editing model are hard to exploit because so little user data is available. On the other hand, the optimization goals of conventional machine translation methods are typically set for the open domain rather than for specific translation tasks. Despite research on domain adaptation, such methods still target professional groups and cannot meet the varied translation requirements of the broad and diverse population of machine translation users, especially online Internet users. Therefore, further improving the quality of machine translation is a technical problem that urgently needs to be solved.
Disclosure of Invention
The invention aims to solve the problems that traditional machine translation methods cannot yield a high-quality personalized translation system and cannot meet users' varied translation requirements, and provides a pseudo-feedback-based personalized machine translation system and translation method that can improve machine translation quality.
A personalized machine translation system based on pseudo-feedback, the translation system comprising:
a phrase table filtering module, used for filtering each general post-editing model phrase table of the development set data;
an input module, used for obtaining a translation task S input by a user;
a preliminary translation module, used for translating the translation task S input by the user to obtain a preliminary machine translation result T' of the translation task, and for translating the source language sentences of the translation example library provided by the local system to obtain the preliminary translation sentences T of the translation examples;
a pseudo-feedback retrieval module, used for retrieving, from the word-aligned translation example library of the local system, the preliminary translation results and the standard translations R of similar translation examples;
a phrase table classification module, used for classifying the phrase table of the trained post-editing model to obtain a personalized post-editing model;
and a decoder module, used for decoding the preliminary translation result of the similar translation example retrieved by the pseudo-feedback retrieval module to obtain a final translation result.
Before a user inputs a translation task S, a general post-editing model is trained by a statistical method on the preliminary translation sentences T of the translation examples and the standard translation sentences R in the translation memory, which completes the training of the general post-editing model. The personalized machine translation method is then realized by the following steps:
step one, phrase table filtering process: each general post-editing model phrase table of the development set data is filtered by the phrase table filtering module;
based on the filtered results, each sentence D_i in the development set data is decoded with default weights to generate an n-best translation result; the n-best translation results are then merged; finally, the MERT tool is used to tune the parameters over the merged n-best translation results as a whole, which realizes the feature parameter optimization process;
step two, input process: a user inputs a translation task S into the input module;
step three, preliminary translation process: the preliminary translation process comprises two parts, one carried out before the user inputs the translation task S and one carried out after;
before the user inputs the translation task S, the translation platform built on the machine translation system of the local system preliminarily translates the source language sentences of the translation example library provided by the local system to obtain the preliminary translation sentences T of the translation examples;
after the translation task S input by the user is obtained through the input module, the preliminary translation module translates it to obtain the preliminary machine translation result T' of the translation task;
step four, pseudo-feedback retrieval process: using the preliminary translation sentences T of the translation examples obtained in step three, the pseudo-feedback retrieval module performs cosine similarity retrieval over a source language bag-of-words model in the local word-aligned translation example library to obtain the preliminary translation results and standard translations R of similar translation examples, and the 900 to 1100 most similar examples are selected from the retrieval results;
the cosine similarity CS is calculated in a vector space model built on the source language bag-of-words representation, as follows:
$$CS(S_{input}, S_{example}) = \frac{Vec(S_{input}) \cdot Vec(S_{example})}{\|Vec(S_{input})\| \, \|Vec(S_{example})\|},$$
where Vec(S_example) is the source language sentence vector of a translation example, Vec(S_input) is the translation task vector, Vec(S_input)·Vec(S_example) is the inner product of the two vectors, and ||·|| is the norm of a vector;
step five, phrase table classification process: based on the preliminary translation results and standard translations R of the 900 to 1100 most similar translation examples selected in step four, the phrase table classification module classifies the phrases in the phrase table of the trained general post-editing model into positive phrases, which help improve translation quality, and negative phrases, which would introduce noise into the final translation result, so that the trained general post-editing model becomes a personalized post-editing model; the positive and negative phrases in the personalized post-editing model are compared against the preliminary translation results and standard translations R of the similar translation examples retrieved in step four, and the negative phrases are filtered out of the personalized post-editing model phrase table, which yields an optimized personalized post-editing model;
step six, decoding process of the decoder module: with the optimized personalized post-editing model from step five as the translation model, the decoder module decodes the preliminary machine translation result T' of the translation task obtained in step three using a conventional machine translation decoding method to obtain an optimized final translation result.
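The retrieval in step four can be illustrated with a minimal sketch. It assumes whitespace-tokenized sentences (Chinese text would need to be word-segmented first) and raw term counts as vector entries; the function names and the tuple layout of the example library are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Turn a whitespace-tokenized sentence into a term-frequency vector."""
    return Counter(sentence.split())


def cosine_similarity(vec_a, vec_b):
    """CS = (a . b) / (||a|| * ||b||); returns 0.0 when either vector is empty."""
    dot = sum(count * vec_b[word] for word, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(task_source, example_base, top_k=1000):
    """Return the top_k examples ranked by source-side cosine similarity.

    example_base is a list of (source, preliminary_translation_T, reference_R)
    tuples; this layout is an assumption made for the sketch.
    """
    query = bag_of_words(task_source)
    scored = [
        (cosine_similarity(query, bag_of_words(src)), src, t, r)
        for src, t, r in example_base
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]
```

The default `top_k=1000` lies inside the 900 to 1100 range named in the text; a real system would likely use an inverted index rather than scoring every example.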
The invention has the following beneficial effects: the pseudo-feedback retrieval module retrieves similar translation examples from the translation example library, the phrase table classification module classifies the general post-editing phrases, negative post-editing phrases are filtered out, and post-editing rules are selected to obtain an optimized personalized post-editing model, which improves the quality of machine translation. In addition, the feature parameter optimization process is applied when the post-editing model is built on top of the preliminary translation: for the given development set data, each input is decoded separately and the parameters are then tuned as a whole, which optimizes the parameters effectively and improves system performance. In particular, when the pseudo-feedback retrieval module searches the local translation example library, it obtains parallel sentence pairs similar to the preliminary translation result of the sentence the user has input for translation, and these pairs stand in for feedback information, which solves the problem that user feedback information is difficult to obtain.
In addition, the method makes good use of the feedback information and builds an effective post-editing model on top of the preliminary translation model. Compared with the translation result of Google, the translation result obtained by the pseudo-feedback-based personalized machine translation system and method improves translation quality by 19.5 percent; compared with the translation result of a machine translation system trained with the Moses tool, translation quality improves by 14.1 percent.
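The "decode separately, then tune as a whole" idea in the feature parameter optimization can be sketched as follows. This is only an illustration of pooling per-sentence n-best lists and rescoring them under one shared weight vector; it does not implement MERT itself, and the data layout and function names are assumptions.

```python
from collections import defaultdict


def merge_nbest(per_sentence_nbest):
    """Pool n-best lists into one dict: sentence id -> unique hypotheses.

    per_sentence_nbest is an iterable of (sentence_id, hypotheses), where each
    hypothesis is a (text, feature_vector) pair (hypothetical layout).
    """
    pool = defaultdict(dict)
    for sentence_id, hypotheses in per_sentence_nbest:
        for text, features in hypotheses:
            # Keep the first feature vector seen for a duplicated hypothesis.
            pool[sentence_id].setdefault(text, tuple(features))
    return {sid: list(hyps.items()) for sid, hyps in pool.items()}


def best_under_weights(pool, weights):
    """Pick, per sentence, the hypothesis with the highest weighted feature sum."""
    def score(features):
        return sum(w * f for w, f in zip(weights, features))

    return {
        sid: max(hyps, key=lambda hyp: score(hyp[1]))[0]
        for sid, hyps in pool.items()
    }
```

A tuning tool would then search for the weight vector whose selected hypotheses score best against the development set references; that search is left out here.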
Drawings
FIG. 1 is a schematic diagram of the translation process of the present invention.
Detailed Description
The first embodiment is as follows:
The pseudo-feedback-based personalized machine translation system of this embodiment comprises:
a phrase table filtering module, used for filtering each general post-editing model phrase table of the development set data;
an input module, used for obtaining a translation task S input by a user;
a preliminary translation module, used for translating the translation task S input by the user to obtain a preliminary machine translation result T' of the translation task, and for translating the source language sentences of the translation example library provided by the local system to obtain the preliminary translation sentences T of the translation examples;
a pseudo-feedback retrieval module, used for retrieving, from the word-aligned translation example library of the local system, the preliminary translation results and the standard translations R of similar translation examples;
a phrase table classification module, used for classifying the phrase table of the trained post-editing model to obtain a personalized post-editing model;
and a decoder module, used for decoding the preliminary translation result of the similar translation example retrieved by the pseudo-feedback retrieval module to obtain a final translation result.
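How the modules hand data to one another can be sketched as a small pipeline. The class, field names and callable signatures below are hypothetical; the data flow follows the method's steps two to six, in which the decoder post-edits the preliminary result T' with the personalized model, and the offline training and phrase table filtering stages are omitted.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# (source sentence, preliminary translation T, standard translation R)
Example = Tuple[str, str, str]


@dataclass
class PseudoFeedbackPipeline:
    translate: Callable[[str], str]             # preliminary translation module
    retrieve: Callable[[str], List[Example]]    # pseudo-feedback retrieval module
    personalize: Callable[[List[Example]], Any] # phrase table classification module
    decode: Callable[[str, Any], str]           # decoder module

    def run(self, task_source: str) -> str:
        preliminary = self.translate(task_source)  # T'
        similar = self.retrieve(task_source)       # similar examples with T and R
        model = self.personalize(similar)          # personalized post-editing model
        return self.decode(preliminary, model)     # final translation result
```

Passing the modules in as callables keeps the sketch independent of any particular translation engine or decoder.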
The second embodiment is as follows:
Different from the first embodiment, in the pseudo-feedback-based personalized machine translation system of this embodiment the phrase table filtering module is included in the phrase table classification module.
The third embodiment is as follows:
In the translation method of the pseudo-feedback-based personalized machine translation system of this embodiment, before a user inputs a translation task S, a general post-editing model is trained by a statistical method on the preliminary translation sentences T of the translation examples and the standard translation sentences R in the translation memory, which completes the training of the general post-editing model. The personalized machine translation method is then realized by the following steps:
step one, phrase table filtering process: each general post-editing model phrase table of the development set data is filtered by the phrase table filtering module;
based on the filtered results, each sentence D_i in the development set data is decoded with default weights to generate an n-best translation result; the n-best translation results are then merged; finally, the MERT tool is used to tune the parameters over the merged n-best translation results as a whole, which realizes the feature parameter optimization process;
step two, input process: a user inputs a translation task S into the input module;
step three, preliminary translation process: the preliminary translation process comprises two parts, one carried out before the user inputs the translation task S and one carried out after;
before the user inputs the translation task S, the translation platform built on the machine translation system of the local system preliminarily translates the source language sentences of the translation example library provided by the local system to obtain the preliminary translation sentences T of the translation examples;
after the translation task S input by the user is obtained through the input module, the preliminary translation module translates it to obtain the preliminary machine translation result T' of the translation task;
step four, pseudo-feedback retrieval process: using the preliminary translation sentences T of the translation examples obtained in step three, the pseudo-feedback retrieval module performs cosine similarity retrieval over a source language bag-of-words model in the local word-aligned translation example library to obtain the preliminary translation results and standard translations R of similar translation examples, and the 900 to 1100 most similar examples are selected from the retrieval results;
the cosine similarity CS is calculated in a vector space model built on the source language bag-of-words representation, as follows:
$$CS(S_{input}, S_{example}) = \frac{Vec(S_{input}) \cdot Vec(S_{example})}{\|Vec(S_{input})\| \, \|Vec(S_{example})\|},$$
where Vec(S_example) is the source language sentence vector of a translation example, Vec(S_input) is the translation task vector, Vec(S_input)·Vec(S_example) is the inner product of the two vectors, and ||·|| is the norm of a vector;
step five, phrase table classification process: based on the preliminary translation results and standard translations R of the 900 to 1100 most similar translation examples selected in step four, the phrase table classification module classifies the phrases in the phrase table of the trained general post-editing model into positive phrases, which help improve translation quality, and negative phrases, which would introduce noise into the final translation result, so that the trained general post-editing model becomes a personalized post-editing model; the positive and negative phrases in the personalized post-editing model are compared against the preliminary translation results and standard translations R of the similar translation examples retrieved in step four, and the negative phrases are filtered out of the personalized post-editing model phrase table, which yields an optimized personalized post-editing model;
step six, decoding process of the decoder module: with the optimized personalized post-editing model from step five as the translation model, the decoder module decodes the preliminary machine translation result T' of the translation task obtained in step three using a conventional machine translation decoding method to obtain an optimized final translation result.
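The classification and filtering in step five can be sketched under one assumed criterion: a phrase pair counts as positive when its preliminary-translation side occurs in some retrieved preliminary translation T and its corrected side occurs in some retrieved standard translation R. The text does not spell out the exact matching rule, so this criterion, the contiguous-token matching and all names below are assumptions.

```python
def contains_phrase(sentence_tokens, phrase_tokens):
    """True if phrase_tokens occurs as a contiguous run inside sentence_tokens."""
    n = len(phrase_tokens)
    if n == 0:
        return False
    return any(
        sentence_tokens[i:i + n] == phrase_tokens
        for i in range(len(sentence_tokens) - n + 1)
    )


def classify_phrase_table(phrase_table, similar_examples):
    """Split phrase pairs into (positive, negative) lists.

    phrase_table: list of (p_t, p_r, probability) entries (hypothetical layout).
    similar_examples: list of (preliminary_translation_T, reference_R) strings.
    Matching is done over the whole retrieved set, not per example pair.
    """
    t_sides = [t.split() for t, _ in similar_examples]
    r_sides = [r.split() for _, r in similar_examples]
    positive, negative = [], []
    for p_t, p_r, prob in phrase_table:
        in_t = any(contains_phrase(t, p_t.split()) for t in t_sides)
        in_r = any(contains_phrase(r, p_r.split()) for r in r_sides)
        (positive if in_t and in_r else negative).append((p_t, p_r, prob))
    return positive, negative
```

Keeping only the `positive` list corresponds to filtering the negative phrases out of the personalized phrase table.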
The fourth embodiment is as follows:
Different from the third embodiment, in the translation method of the pseudo-feedback-based personalized machine translation system of this embodiment, the decoding process in step six uses a formula to process the preliminary machine translation result T' of the translation task and obtain an optimized final translation result. In the formula, P(T'', T') is the translation probability of the general post-editing model, and P(S | T'', T') is the probability that the phrase pair (T'', T') in the general post-editing model is used for post-editing translation of the preliminary machine translation sentence T' given the input translation task S; its value is defined as 1 or 0 and is obtained by either of the following two methods:
1) when the two phrases of a phrase pair (P_T, P_R) in the optimized personalized post-editing model respectively match at least one phrase in the preliminary machine translation result T' of the translation task and in the standard translation R, P(S | T'', T') takes the value 1, otherwise 0; or,
2) when the phrase P_R of a phrase pair (P_T, P_R) in the optimized personalized post-editing model matches at least one phrase in the standard translation R, P(S | T'', T') takes the value 1, otherwise 0.
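The two ways of setting the 0/1 value can be illustrated with a short sketch. "Match" is interpreted here as a whole-token substring match, which is an assumption, and the function and parameter names are hypothetical. How this indicator is combined with P(T'', T') during decoding is not shown.

```python
def occurs_in_any(phrase, sentences):
    """True if phrase appears as a whole-token run in at least one sentence."""
    phrase = phrase.strip()
    if not phrase:
        return False
    padded = f" {phrase} "
    return any(padded in f" {s.strip()} " for s in sentences)


def selection_indicator(p_t, p_r, preliminary_sides, reference_sides, mode="both"):
    """Return 1 or 0 for a phrase pair (p_t, p_r).

    mode="both": p_t must occur on the preliminary side and p_r on the
    reference side (method 1).
    mode="reference_only": only p_r must occur on the reference side (method 2).
    """
    if mode == "both":
        keep = occurs_in_any(p_t, preliminary_sides) and occurs_in_any(p_r, reference_sides)
    elif mode == "reference_only":
        keep = occurs_in_any(p_r, reference_sides)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return 1 if keep else 0
```

Method 2 is the looser test: it keeps any pair whose corrected side is attested in the references, even when its preliminary side never occurs in T'.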
The fifth embodiment is as follows:
Different from the third or fourth embodiment, in the translation method of the pseudo-feedback-based personalized machine translation system of this embodiment, in the pseudo-feedback retrieval process of step four the top 1000 most similar examples are selected from the retrieved preliminary translation results of similar translation examples and standard translations R.
The Olympic task of IWSLT 2012 is used as the user's translation task, and its data are used to test the pseudo-feedback-based personalized machine translation system and method. The training data provided with the task belong to the travel spoken-language domain and cover specific application scenarios such as traffic, catering, stadiums and commerce against the background of the Olympic Games. They contain 52,603 Chinese-English bilingual sentence pairs, with 495,638 Chinese words and 527,599 English words, and serve as the user's personalized local translation example library. A development set of 2,057 Chinese-English bilingual sentence pairs and a test set of 998 Chinese-English bilingual sentence pairs are used. The preliminary translation module uses the Google online translation system, from which the translations of the corpus are crawled; BLEU-4 is adopted as the translation quality evaluation standard, and the test results are compared directly with the Google translation results. A machine translation system trained with the open-source Moses tool serves as a second control for comparison.
With the BLEU-4 score as the evaluation standard, the translation result obtained by the pseudo-feedback-based personalized machine translation system and method improves translation quality by 19.5% compared with the translation result of the Google online translation system, and by 14.1% compared with the translation result of a machine translation system trained with the Moses tool. The test results are shown in Table 1.
Table 1: Comparison of translation quality between the pseudo-feedback-based personalized translation results and the translation results of other systems.

Claims (5)

1. A personalized machine translation system based on pseudo feedback, the translation system comprising:
a phrase table filtering module, used for filtering each general post-editing model phrase table of the development set data;
an input module, used for obtaining a translation task S input by a user;
a preliminary translation module, used for translating the translation task S input by the user to obtain a preliminary machine translation result T' of the translation task, and for translating the source language sentences of the translation example library provided by the local system to obtain the preliminary translation sentences T of the translation examples;
a pseudo-feedback retrieval module, used for retrieving, from the word-aligned translation example library of the local system, the preliminary translation results and the standard translations R of similar translation examples;
a phrase table classification module, used for classifying the phrase table of the trained post-editing model to obtain a personalized post-editing model;
and a decoder module, used for decoding the preliminary translation result of the similar translation example retrieved by the pseudo-feedback retrieval module to obtain a final translation result.
2. The personalized machine translation system based on pseudo feedback of claim 1, wherein the phrase table filtering module is included in the phrase table classification module.
3. The translation method of the personalized machine translation system based on pseudo feedback as claimed in claim 2, wherein: before a user inputs a translation task S, a general post-editing model is trained by a statistical method on the preliminary translation sentences T of the translation examples and the standard translation sentences R in the translation memory, which completes the training of the general post-editing model; the personalized machine translation method is realized by the following steps:
step one, phrase table filtering process: each general post-editing model phrase table of the development set data is filtered by the phrase table filtering module;
based on the filtered results, each sentence D_i in the development set data is decoded with default weights to generate an n-best translation result; the n-best translation results are then merged; finally, the MERT tool is used to tune the parameters over the merged n-best translation results as a whole, which realizes the feature parameter optimization process;
step two, input process: a user inputs a translation task S into the input module;
step three, preliminary translation process: the preliminary translation process comprises two parts, one carried out before the user inputs the translation task S and one carried out after;
before the user inputs the translation task S, the translation platform built on the machine translation system of the local system preliminarily translates the source language sentences of the translation example library provided by the local system to obtain the preliminary translation sentences T of the translation examples;
after the translation task S input by the user is obtained through the input module, the preliminary translation module translates it to obtain the preliminary machine translation result T' of the translation task;
step four, pseudo-feedback retrieval process: using the preliminary translation sentences T of the translation examples obtained in step three, the pseudo-feedback retrieval module performs cosine similarity retrieval over a source language bag-of-words model in the local word-aligned translation example library to obtain the preliminary translation results and standard translations R of similar translation examples, and the 900 to 1100 most similar examples are selected from the retrieval results;
the cosine similarity CS is calculated in a vector space model built on the source language bag-of-words representation, as follows:
$$CS(S_{input}, S_{example}) = \frac{Vec(S_{input}) \cdot Vec(S_{example})}{\|Vec(S_{input})\| \, \|Vec(S_{example})\|},$$
where Vec(S_example) is the source language sentence vector of a translation example, Vec(S_input) is the translation task vector, Vec(S_input)·Vec(S_example) is the inner product of the two vectors, and ||·|| is the norm of a vector;
step five, phrase table classification process: based on the preliminary translation results and standard translations R of the 900 to 1100 most similar translation examples selected in step four, the phrase table classification module classifies the phrases in the phrase table of the trained general post-editing model into positive phrases, which help improve translation quality, and negative phrases, which would introduce noise into the final translation result, so that the trained general post-editing model becomes a personalized post-editing model; the positive and negative phrases in the personalized post-editing model are compared against the preliminary translation results and standard translations R of the similar translation examples retrieved in step four, and the negative phrases are filtered out of the personalized post-editing model phrase table, which yields an optimized personalized post-editing model;
step six, decoding process of the decoder module: with the optimized personalized post-editing model from step five as the translation model, the decoder module decodes the preliminary machine translation result T' of the translation task obtained in step three using a conventional machine translation decoding method to obtain an optimized final translation result.
4. The translation method of the personalized machine translation system based on pseudo feedback according to claim 3, wherein: the decoding process in step six uses a formula to process the preliminary machine translation result T' of the translation task and obtain an optimized final translation result; in the formula, P(T'', T') is the translation probability of the general post-editing model, and P(S | T'', T') is the probability that the phrase pair (T'', T') in the general post-editing model is used for post-editing translation of the preliminary machine translation sentence T' given the input translation task S; its value is defined as 1 or 0 and is obtained by either of the following two methods:
1) when the two phrases of a phrase pair (P_T, P_R) in the optimized personalized post-editing model respectively match at least one phrase in the preliminary machine translation result T' of the translation task and in the standard translation R, P(S | T'', T') takes the value 1, otherwise 0; or,
2) when the phrase P_R of a phrase pair (P_T, P_R) in the optimized personalized post-editing model matches at least one phrase in the standard translation R, P(S | T'', T') takes the value 1, otherwise 0.
5. The translation method of the personalized machine translation system based on pseudo feedback according to claim 3 or 4, wherein: in the pseudo-feedback retrieval process of step four, the top 1000 most similar examples are selected from the retrieved preliminary translation results of similar translation examples and standard translations R.
CN201410491100.9A 2014-09-24 2014-09-24 Pseudo-feedback-based personalized machine translation system and method Active CN104199813B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410491100.9A CN104199813B (en) 2014-09-24 2014-09-24 Pseudo-feedback-based personalized machine translation system and method

Publications (2)

Publication Number Publication Date
CN104199813A CN104199813A (en) 2014-12-10
CN104199813B true CN104199813B (en) 2017-05-24

Family

ID=52085108

Country Status (1)

Country Link
CN (1) CN104199813B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107301173B (en) * 2017-06-22 2019-10-25 北京理工大学 A kind of automatic post-editing system and method for multi-source neural network remixing mode based on splicing
JP6976447B2 (en) * 2018-08-24 2021-12-08 株式会社Nttドコモ Machine translation controller
US20210034824A1 (en) * 2018-08-24 2021-02-04 Ntt Docomo, Inc. Machine translation control device
CN112446222A (en) * 2019-08-16 2021-03-05 阿里巴巴集团控股有限公司 Translation optimization method and device and processor
CN111274827B (en) * 2020-01-20 2021-05-28 南京新一代人工智能研究院有限公司 Suffix translation method based on multi-target learning of word bag
CN113807106B (en) * 2021-08-31 2023-03-07 北京百度网讯科技有限公司 Translation model training method and device, electronic equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103793368A (en) * 2012-10-31 2014-05-14 上海勇金懿信息科技有限公司 Method for automatically protecting marks in marking language in automatic translation processing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8494835B2 (en) * 2008-12-02 2013-07-23 Electronics And Telecommunications Research Institute Post-editing apparatus and method for correcting translation errors


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Step towards Human-Machine Unification Using Translation Memory and Machine Translation System; Nishtha Jaiswal et al.; 2011 International Conference on Languages, Literature and Linguistics; 2011-12-28; Vol. 26; pp. 64-68 *
Design and implementation algorithm of an intelligent post-editor; Huang Heyan et al.; Journal of Software (软件学报); 1995-03-05; Vol. 6, No. 3; pp. 129-135 *

Also Published As

Publication number Publication date
CN104199813A (en) 2014-12-10

Similar Documents

Publication Publication Date Title
CN104199813B (en) Pseudo-feedback-based personalized machine translation system and method
Cussens Part-of-speech tagging using Progol
Li et al. Text compression-aided transformer encoding
CN102955772B (en) A kind of similarity calculating method based on semanteme and device
Wang et al. Automatic construction of discourse corpora for dialogue translation
Amin et al. CMS-Intelligent machine translation with adaptation and AI
Jabbari et al. Developing an open-domain English-Farsi translation system using AFEC: Amirkabir bilingual Farsi-English corpus
Tien et al. Long sentence preprocessing in neural machine translation
Ostapenko et al. Speaker information can guide models to better inductive biases: A case study on predicting code-switching
Ramanarayanan et al. Automatic turn-level language identification for code-switched spanish–english dialog
Parikh et al. IRLab_DAIICT at SemEval-2020 task 9: Machine learning and deep learning methods for sentiment analysis of code-mixed tweets
CN112949293A (en) Similar text generation method, similar text generation device and intelligent equipment
Berrichi et al. Benefits of morphosyntactic features on English-Arabic statistical machine translation
CN106776590A (en) A kind of method and system for obtaining entry translation
Leidig et al. Automatic detection of anglicisms for the pronunciation dictionary generation: a case study on our German IT corpus.
Sreeram et al. A Novel Approach for Effective Recognition of the Code-Switched Data on Monolingual Language Model.
JP5298834B2 (en) Example sentence matching translation apparatus, program, and phrase translation apparatus including the translation apparatus
Fernández et al. Identifying relevant phrases to summarize decisions in spoken meetings.
Tao et al. Improving matching models with hierarchical contextualized representations for multi-turn response selection
Tran et al. A classifier-based preordering approach for english-vietnamese statistical machine translation
CN110610001A (en) Short text integrity identification method and device, storage medium and computer equipment
Viet et al. Dependency-based pre-ordering for English-Vietnamese statistical machine translation
Tran et al. A reordering model for Vietnamese-English statistical machine translation using dependency information
Liu et al. A dependency-based hybrid deep learning framework for target-dependent sentiment classification
Estarrona et al. Dealing with dialectal variation in the construction of the Basque historical corpus

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200327

Address after: 150001 No. 118 West Dazhi Street, Nangang District, Harbin, Heilongjiang

Patentee after: Harbin University of Technology High Tech Development Corporation

Address before: 150001 No. 92 West Dazhi Street, Nangang District, Harbin

Patentee before: HARBIN INSTITUTE OF TECHNOLOGY
