CN107885729B - Interactive machine translation method based on bilingual fragments - Google Patents


Info

Publication number
CN107885729B
Authority
CN
China
Prior art keywords
translation
segment
translator
bilingual
options
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710877018.3A
Other languages
Chinese (zh)
Other versions
CN107885729A (en)
Inventor
叶娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Aerospace University
Original Assignee
Shenyang Aerospace University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Aerospace University filed Critical Shenyang Aerospace University
Priority to CN201710877018.3A priority Critical patent/CN107885729B/en
Publication of CN107885729A publication Critical patent/CN107885729A/en
Application granted granted Critical
Publication of CN107885729B publication Critical patent/CN107885729B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/44 Statistical methods, e.g. probability models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/51 Translation evaluation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to an interactive machine translation method based on bilingual fragments, which comprises the following steps: establishing a mathematical model: for each source language segment, a plurality of translation options are provided to the translator, and an optimal translation is obtained through the mathematical model; designing a translator interface: the interface comprises an interactive area and an editing area, wherein the interactive area gives the phrase-segmented source sentence and the translation options, and the editing area gives the machine translation when the translator finishes confirmation and clicks the translate button; and decoding: after the translator has confirmed the bilingual segments in the interactive area, capturing the translator's selection of a translation option for each segment f_i and the current segmentation result of the source sentence, and decoding with a phrase-based statistical machine translation decoder implemented through a multi-stack decoding algorithm. The invention improves the interactive protocol, allows the translator to confirm bilingual segments, provides more clues for the translator, gives more direct guidance to the decoder, reduces human labor in the human-computer interaction process, and improves the efficiency and translation quality of interactive machine translation.

Description

Interactive machine translation method based on bilingual fragments
Technical Field
The invention relates to a natural language translation technology, in particular to an interactive machine translation method based on bilingual fragments.
Background
Statistical machine translation and neural machine translation techniques have resulted in a significant improvement in the performance of machine translation systems. However, in many tasks with higher quality requirements, the output quality of machine translation is still insufficient and must be modified by a human translator during post-editing before it can be used.
To enhance human-computer collaboration, Foster proposes interactive machine translation techniques. In an interactive machine translation system, a modification-prediction process is repeated. First, an interactive machine translation system provides an initial translation. The translator then confirms the longest correct prefix in it and modifies the next word. Next, the system predicts a new suffix that is expected to be better than previously. This process is repeated until a correct translation is obtained.
Recently, this left-to-right protocol (i.e., the interaction process described in the paragraph above) has been extended to make human-computer interaction more flexible. In the extended protocol, the translator may identify the segments that should be retained in the translation. However, this protocol still suffers from three problems: first, the location of a confirmed segment is unknown, so the search process can only be optimized in the form of a soft constraint; second, the translator's confirmation is limited to the translation provided by the system, and no clues about other translation options are available; third, identifying the correct segments in an incorrect translation often requires a great deal of cognitive effort, especially when the translation is of low quality.
Disclosure of Invention
In view of the above problems in the prior art, the present invention is to provide a bilingual segment-based interactive machine translation method that can provide more clues to the translator and give the decoder more direct guidance.
In order to solve the technical problems, the invention adopts the technical scheme that:
the invention relates to an interactive machine translation method based on bilingual fragments, which comprises the following steps:
1) establishing a mathematical model: for each source language snippet, providing a plurality of translation options to the translator, wherein an optimal translation is obtained through a mathematical model;
2) designing a translator interface: the interface comprises an interactive area and an editing area, wherein the interactive area provides the phrase-segmented source sentence and the plurality of translation options provided for the translator in step 1), and the editing area provides the machine translation when the translator finishes confirming and clicks the translate button;
3) decoding: after the translator has confirmed the bilingual segments in the interactive area, capturing the translator's selection of a translation option for each segment f_i and the current segmentation result of the source sentence, and decoding with a phrase-based statistical machine translation decoder implemented through a multi-stack decoding algorithm.
The mathematical model is implemented by the following formula:
[Formula (1), shown as an image in the original publication]
where e_i is the correct translation of f_i confirmed by the translator, f_i is the i-th source language segment, t is a candidate translation, N is the number of bilingual segments, i is the index of a bilingual segment, P is the translation probability of the candidate translation, and s is the source sentence.
The translator interface also has three auxiliary functions, namely fragment splitting-merging, translation option reordering and suffix prediction, wherein the fragment splitting-merging is that two bidirectional arrows are arranged above each fragment, one bidirectional outward-pointing arrow is a splitting arrow, and the fragment is split into two shorter fragments; another type of bi-directional inwardly pointing arrow is a merge arrow that merges the current segment and the next segment of the current segment into a longer segment.
If no shorter or longer segment is present in the phrase table, the corresponding arrow does not appear; otherwise, if a shorter or longer segment is present in the phrase table, the arrow appears when the mouse is placed over the segment.
The translation options are reordered as: the translator selects either the default mode or the reordering mode before starting the translation; when a new segment is generated, the translation options are changed, and the options of the segment are arranged and displayed according to the sequence in the phrase table under the default condition; when the reordering mode is selected, the top N translation options in the phrase table are reordered to generate a new option list.
The reordering is:
setting a new option list T for each source language phrase p; first adding the option with the highest score in the original phrase table to T; then traversing the remaining N-1 options, finding the option with the highest diversity relative to the options in T, and adding it to T; and repeatedly traversing the remaining options in the same way until all N options are reordered. The diversity between translation options t_a and t_b is calculated by the following formula:
[Formula (2), shown as an image in the original publication]
where c(t_a, t_b) is the number of words shared by translation option t_a and translation option t_b, and a and b are the indices of the translation options.
The suffix prediction is: in the editing area, the translator clicks the "predict" button to obtain a predicted suffix from the system; when the button is clicked, the current position of the cursor is recorded, and the text before the cursor is used as the prefix; the confirmed bilingual segments and the prefix are both used as constraints to find an optimal suffix; when a new suffix is generated, it replaces the current suffix.
If the decoder does not find any matching candidate translations, the suffix is not altered.
The decoding includes the following processes:
construct a set as a constraint for decoding:
C = {S, <p_1, f_1, e_1>, <p_2, f_2, e_2>, ..., <p_N, f_N, e_N>}    (3)
where p_i is the position of segment f_i in the source sentence; f_i is the i-th segment; S is the current segmentation result of the source sentence; e_i is the correct translation of f_i confirmed by the translator; and N is the number of bilingual segments.
S is taken as the only segmentation of the source sentence during decoding.
The translation options for each source language phrase or segment are restricted by <p_i, f_i, e_i>: only e_i is retained and takes part in the subsequent decoding process.
The translator must click on an option for it to become a confirmed bilingual segment together with its source language segment; if no translation option of a segment has been clicked, the segment and its options cannot be used as decoding constraints.
The invention has the following beneficial effects and advantages:
1. The invention improves the interactive protocol by allowing the translator to confirm bilingual segments. This provides more clues for the translator, gives the decoder more direct guidance, reduces human labor in the human-computer interaction process, and improves the efficiency and translation quality of interactive machine translation. Confirming a bilingual segment is also easier than identifying the correct segments in a wrong translation.
2. The invention also designs an interface facing the real translator, allows the translator to split and combine the split phrases, and provides a reordering method for increasing the diversity of translation options, which is helpful for improving the interactive translation efficiency in the real scene. The experimental results of the real translator show that the new protocol improves the efficiency and the quality of the interactive machine translation on the three Chinese-English translation tasks.
Drawings
FIG. 1 is a diagram of an example bilingual-segment-based interactive machine translation protocol according to the present invention;
FIG. 2 is a diagram of the translator interface of the bilingual-segment-based interactive machine translation system of the present invention.
Detailed Description
The invention is further elucidated with reference to the accompanying drawings.
Aiming at the problems in the interactive machine translation, the interactive protocol is improved, the translator is allowed to confirm the bilingual segment, more clues are provided for the translator, the decoder is given more direct guidance, the human labor in the human-computer interaction process is reduced, and the interactive machine translation efficiency and the translation quality are improved.
The invention relates to an interactive machine translation method based on bilingual fragments, which comprises the following steps:
1) establishing a mathematical model: for each source language snippet, providing a plurality of translation options to the translator, wherein an optimal translation is obtained through a mathematical model;
2) designing a translator interface: the interface comprises an interactive area and an editing area, wherein the interactive area provides the phrase-segmented source sentence and the plurality of translation options provided for the translator in step 1), and the editing area provides the machine translation when the translator finishes confirming and clicks the translate button;
3) decoding: after the translator has confirmed the bilingual segments in the interactive area, capturing the translator's selection of a translation option for each segment f_i and the current segmentation result of the source sentence, and decoding with a phrase-based statistical machine translation decoder implemented through a multi-stack decoding algorithm.
In step 1), the source language segments are aligned with their corresponding target language segments. For each source language segment, multiple translation options are provided. The translator can confirm bilingual segments of the form <f_i, e_i>. The optimal translation is obtained by the following formula:
[Formula (1), shown as an image in the original publication]
where e_i is the correct translation of f_i confirmed by the translator, f_i is the i-th source language segment, t is a candidate translation, N is the number of bilingual segments, i is the index of a bilingual segment, P is the translation probability of the candidate translation, and s is the source sentence.
In equation (1), the search space consists of the translation hypotheses that are consistent with these bilingual segments.
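Since formula (1) is available here only as an image reference, the following is a tentative reconstruction from the symbol definitions above, not the published formula. The symbol \hat{t} for the optimal translation and the exact way the confirmed pairs enter the conditional probability are assumptions.

```latex
% Assumed form: the best candidate t given the source sentence s and the
% N confirmed bilingual segments <f_i, e_i> (requires amsmath).
\hat{t} = \operatorname*{arg\,max}_{t}\;
          P\bigl(t \mid s,\ \langle f_1, e_1 \rangle, \ldots, \langle f_N, e_N \rangle\bigr)
\tag{1}
```

Read this way, the confirmed pairs act as hard constraints: a hypothesis that does not translate f_i as e_i is outside the search space.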
FIG. 1 gives an example of the new protocol. The translator has confirmed three bilingual segments (the boxed portions of the figure), and the decoder has produced a better translation; the translator then enters the prefix "a" and decodes again, yielding the correct translation IT-2.
In step 2), the invention employs the translator interface shown in FIG. 2. The interface consists of two areas. One is the interactive area, which gives the phrase-segmented source sentence and the translation options, with each segment and its options left-aligned. When the mouse is placed on a source language segment, a menu with the K-best translation options is displayed, and the translator can click to confirm the preferred option. The other is the editing area, which gives the machine translation when the translator completes the confirmation and clicks the "translate" button. There the translator may make modifications at will until the translation is accepted. The interactive process and the editing process may alternate.
One prominent feature of phrase-based statistical machine translation is that it extracts translations of longer phrases. Using long phrases as the basic translation units effectively alleviates the problem of word ambiguity and gives good results. Therefore, longer segments and their translations are displayed preferentially in the interface, and the source sentence is initially segmented against the phrase table with the forward maximum matching algorithm. The translation options displayed are the top K options in the phrase table.
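The initial segmentation step can be sketched as follows. This is a minimal illustration of forward maximum matching, not the patented implementation: the function names, the `max_len` window, the dictionary-based phrase table, and the rule that an unmatched single word forms its own segment are all assumptions.

```python
from typing import Dict, List, Sequence


def forward_max_match(
    words: Sequence[str],
    phrase_table: Dict[str, List[str]],
    max_len: int = 7,  # assumed upper bound on phrase length
) -> List[List[str]]:
    """Segment `words` left to right, taking the longest phrase-table match each time."""
    segments: List[List[str]] = []
    i = 0
    while i < len(words):
        # Try the longest window first and shrink it until an entry matches.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            # Assumption: a single word is always accepted, so unknown
            # words still end up in a segment of their own.
            if n == 1 or candidate in phrase_table:
                segments.append(list(words[i:i + n]))
                i += n
                break
    return segments


def top_k_options(segment: Sequence[str], phrase_table: Dict[str, List[str]], k: int = 10) -> List[str]:
    """Return the first k options, assuming the table lists them best first."""
    return phrase_table.get(" ".join(segment), [])[:k]


# Toy phrase table with placeholder tokens (hypothetical data).
table = {"w1 w2": ["A B", "A"], "w1": ["A"], "w3": ["C"]}
segments = forward_max_match(["w1", "w2", "w3", "w4"], table)
```

With the toy table, the two-word entry "w1 w2" is preferred over the single word "w1", which reflects the stated preference for longer segments.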
The translator interface also provides three auxiliary functions: segment split-merge, translation option reordering, and suffix prediction.
a. Fragment splitting-merging
The fragment splitting-merging is that two bidirectional arrows are arranged above each fragment, one bidirectional outward-pointing arrow is a splitting arrow, and the fragment is split into two shorter fragments; another type of bi-directional inwardly pointing arrow is a merge arrow that merges the current segment and the next segment of the current segment into a longer segment.
If no shorter or longer segments are present in the phrase table, the arrow does not appear. Otherwise, when the mouse is placed over the segment, the arrow will appear. Once a new segment is generated, its translation options are changed.
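The arrow logic can be sketched as a pair of phrase-table lookups. The text does not say how the split point is chosen, so this sketch assumes that both halves must be phrase-table entries and that the longest valid left half is preferred; all names are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Segment = Tuple[str, ...]


def split_segment(seg: Segment, phrases: Set[str]) -> Optional[Tuple[Segment, Segment]]:
    """Return a split of `seg` into two shorter phrase-table entries, or None."""
    # Assumption: prefer the longest left half.
    for cut in range(len(seg) - 1, 0, -1):
        left, right = seg[:cut], seg[cut:]
        if " ".join(left) in phrases and " ".join(right) in phrases:
            return left, right
    return None


def merge_segments(seg: Segment, nxt: Segment, phrases: Set[str]) -> Optional[Segment]:
    """Return the current and next segment joined, if the result is in the phrase table."""
    merged = seg + nxt
    return merged if " ".join(merged) in phrases else None


def arrows_for(index: int, segmentation: List[Segment], phrases: Set[str]) -> Tuple[bool, bool]:
    """Return (show_split_arrow, show_merge_arrow) for the segment at `index`."""
    seg = segmentation[index]
    can_split = split_segment(seg, phrases) is not None
    has_next = index + 1 < len(segmentation)
    can_merge = has_next and merge_segments(seg, segmentation[index + 1], phrases) is not None
    return can_split, can_merge


# Toy data: "a b" can be split, but "a b c" is not in the phrase table.
phrases = {"a b", "a", "b", "c"}
segmentation = [("a", "b"), ("c",)]
```

After a split or merge, the interface would look up the options of the new segment again, since its translation options change.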
b. Translation option reordering
By default, the options for the snippet are arranged and displayed in the order in the phrase table. However, the highest scoring options are sometimes very similar. The invention thus provides an alternative mode, increasing the variety of options. The translator may select either the default mode or the reorder mode before beginning the translation.
In the reordering mode, the top N translation options in the phrase table are reordered to produce a new list of options. For each source language phrase p, a new option list T (initially empty) is set. First, the option with the highest score in the original phrase table is added to T. Then the remaining N-1 options are traversed, and the option with the highest diversity relative to the options in T is found and added to T. This process is repeated until all N options are reordered. The diversity between translation options t_a and t_b is calculated by the following formula:
[Formula (2), shown as an image in the original publication]
where c(t_a, t_b) is the number of words shared by t_a and t_b (after lemmatization), and a and b are the indices of the translation options.
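The greedy reordering can be sketched as below. Two parts are assumptions rather than the published method, because formula (2) is available only as an image: the diversity function `overlap_diversity` is a stand-in built from the shared-word count c(t_a, t_b), and the diversity of a candidate relative to T is taken as its smallest diversity to any member of T. Lemmatization is omitted.

```python
from typing import Callable, List


def overlap_diversity(ta: str, tb: str) -> float:
    """Assumed stand-in for formula (2): 1 - c(ta, tb) / max(|ta|, |tb|)."""
    wa, wb = ta.lower().split(), tb.lower().split()
    # c(ta, tb): number of distinct words shared by the two options.
    shared = len(set(wa) & set(wb))
    return 1.0 - shared / max(len(wa), len(wb), 1)


def reorder_options(
    options: List[str],
    diversity: Callable[[str, str], float] = overlap_diversity,
) -> List[str]:
    """Greedily reorder `options` (best first) so that diverse options rise."""
    if not options:
        return []
    reordered = [options[0]]        # T starts with the highest-scoring option
    remaining = list(options[1:])
    while remaining:
        # Assumption: a candidate is scored by its smallest diversity to T.
        # Ties keep the original phrase-table order.
        best = max(remaining, key=lambda t: min(diversity(t, u) for u in reordered))
        reordered.append(best)
        remaining.remove(best)
    return reordered


result = reorder_options(["the law", "the laws", "legislation", "a law"])
```

In this toy list the near-duplicate "the laws" is pushed below "legislation", which is the effect the reordering mode is meant to have on a menu of similar top-scoring options.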
c. Suffix prediction
For the auxiliary function of suffix prediction, a constraint is added to the decoder: a translation hypothesis must match the given prefix t_p.
In the editing area, the translator may click the "predict" button to obtain a predicted suffix from the system. When the button is clicked, the current position of the cursor is recorded, and the text before the cursor is used as the prefix. Both the confirmed bilingual segments and the prefix are used as constraints to find the optimal suffix. Once a new suffix is generated, it replaces the current suffix. If the decoder does not find any compatible hypothesis, the suffix is not altered.
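The behavior of the predict button can be sketched as follows. In the described system the prefix constrains the decoder search itself; this sketch substitutes a simpler stand-in that filters a scored n-best list, and the function name and data layout are hypothetical.

```python
from typing import List, Tuple


def predict_suffix(text: str, cursor: int, nbest: List[Tuple[str, float]]) -> str:
    """Return the editing-area text after one click on the predict button.

    `nbest` is a hypothetical list of (hypothesis, score) pairs that already
    respect the confirmed bilingual segments; a higher score is better.
    """
    prefix = text[:cursor]  # everything before the cursor is the prefix t_p
    compatible = [(hyp, score) for hyp, score in nbest if hyp.startswith(prefix)]
    if not compatible:
        return text  # no compatible hypothesis: the suffix is left unchanged
    best_hyp, _ = max(compatible, key=lambda pair: pair[1])
    return best_hyp  # the prefix followed by the newly predicted suffix


nbest = [
    ("the court must consider the case", -2.0),
    ("a court shall decide", -1.0),
    ("the court may consider", -3.0),
]
updated = predict_suffix("the court shall decide", cursor=10, nbest=nbest)
unchanged = predict_suffix("this court", cursor=10, nbest=nbest)
```

The second call shows the fallback: no hypothesis starts with "this court", so the text in the editing area stays as it was.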
In step 3), the decoding process is as follows:
After the translator completes the confirmation of bilingual segments in the interactive area, the system captures the translator's selection of a translation option for each segment f_i and the current segmentation result S of the source sentence. A set is constructed as the constraint for decoding:
C = {S, <p_1, f_1, e_1>, <p_2, f_2, e_2>, ..., <p_N, f_N, e_N>}    (3)
where p_i is the position of segment f_i in the source sentence; f_i is the i-th segment; S is the current segmentation result of the source sentence; and e_i is the correct translation of f_i confirmed by the translator.
S is taken as the only segmentation of the source sentence during decoding.
The translation options for each source language phrase or segment are restricted by <p_i, f_i, e_i>: only e_i is retained and takes part in the subsequent decoding process.
p_i is recorded to avoid the ambiguity caused by multiple occurrences of a segment. The translator must click on an option for it to become a confirmed bilingual segment together with its source language segment. If no translation option of a segment has been clicked, the segment and its options cannot be used as decoding constraints.
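How the constraint set C narrows the translation options before decoding can be sketched as below. The data layout is an assumption: positions are counted as word offsets, segments are space-joined strings, and the class and function names are hypothetical. The multi-stack decoder itself is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Confirmed:
    position: int  # p_i: word offset of f_i in the source sentence (assumed convention)
    source: str    # f_i: the source language segment
    target: str    # e_i: the translation the translator clicked


def restrict_options(
    segmentation: List[str],
    phrase_table: Dict[str, List[str]],
    confirmed: List[Confirmed],
) -> List[Tuple[int, str, List[str]]]:
    """Return (position, segment, allowed options) for each segment of S."""
    by_key = {(c.position, c.source): c.target for c in confirmed}
    restricted = []
    position = 0
    for segment in segmentation:
        options = phrase_table.get(segment, [])
        target = by_key.get((position, segment))
        if target is not None:
            options = [target]  # a confirmed segment keeps only e_i
        restricted.append((position, segment, options))
        position += len(segment.split())
    return restricted


# "x y" occurs twice; only the occurrence at word offset 3 was confirmed.
S = ["x y", "z", "x y"]
table = {"x y": ["A", "B"], "z": ["C", "D"]}
C = restrict_options(S, table, [Confirmed(position=3, source="x y", target="B")])
```

The repeated segment illustrates why p_i is recorded: without the position, the confirmation of the second "x y" would also restrict the first one.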
Table 1 shows a comparative example of a real interactive machine translation.
Table 1. Comparative example of interactive machine translation protocols
[Table 1 is reproduced as an image in the original publication.]
In this example, the prefix-based protocol requires 6 decodings, including 2 tense changes ("study" and "consider"), 1 insertion of a missing word ("functions"), and 1 word order adjustment ("of"). In contrast, the protocol of the invention decodes only twice after the bilingual segments are confirmed, and the correct translation options for the content words are all displayed in the list, so the translator can click on them directly to confirm.
(1) Data setting
The invention was tested with real translators on three different Chinese-English translation tasks. "Law" is the legal text of the LDC2000T47 corpus. "Meeting record" is the meeting record text of the LDC2000T50 corpus. "News" is the news text of the LDC2000T46 corpus. Table 2 gives the main information for these corpora (S, T and V indicate the number of sentences, the number of words, and the size of the vocabulary, respectively; K and M represent thousands and ten thousand, respectively).
Table 2. Main information of the test corpora
[Table 2 is reproduced as an image in the original publication.]
The Chinese part of the data is preprocessed with the ICTCLAS word segmentation tool, and the English part is tokenized and lowercased. A word alignment model is trained with GIZA++, a 5-gram language model is trained with IRSTLM, and a phrase-based statistical machine translation model with 14 default features is built with Moses; the feature weights are tuned with MERT.
Three interactive machine translation systems were evaluated in the experiment. Baseline is a prefix-based system, BiSeg is the system of the invention without the option reordering function, and BiSeg+D is the system with the option reordering function. In the translator interface, the number of translation options displayed is set to 10 and the number of reordered translation options is set to 20.
(2) Evaluation index
In the field of interactive machine translation, experiments with real translators are costly, so prototype systems are mainly evaluated with automatic metrics. These metrics simulate translator behavior rather than measuring the actual behavior of a translator during interaction. However, a direct evaluation of an interactive machine translation system still requires experiments with real translators. The invention therefore evaluates the performance of the interactive machine translation system with real translators in terms of efficiency and quality. Three indices are used to evaluate translation efficiency: translation time, keystroke and mouse-action ratio (KSMR), and number of decodings.
Translation quality is evaluated with the BLEU score, using the English side of the original bilingual corpus as the reference translation. The final translation accepted by the translator is regarded as correct even when it is not identical to the reference translation.
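The text names KSMR without defining it. The sketch below assumes the definition commonly used in the interactive machine translation literature, namely keystrokes plus mouse actions divided by the number of characters in the reference translation; the function name and the sample numbers are hypothetical.

```python
def ksmr(keystrokes: int, mouse_actions: int, reference: str) -> float:
    """Keystroke and mouse-action ratio under the assumed definition."""
    if not reference:
        raise ValueError("reference translation must be non-empty")
    # Every key press and every mouse action counts as one unit of effort.
    return (keystrokes + mouse_actions) / len(reference)


# Hypothetical session: 5 key presses and 3 clicks for a 40-character reference.
score = ksmr(keystrokes=5, mouse_actions=3, reference="x" * 40)
```

Under this definition, a protocol that relies on clicking options raises the mouse-action count, so its KSMR can be high even when the total translation time is low.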
(3) Participants and processes
Nine researchers (6 women) volunteered to take part in the experiment as non-professional translators. All are native Chinese speakers proficient in English. The participants were randomly divided into 3 groups (G_1 to G_3) of 3 people each. The test set of each corpus was randomly divided into 3 parts (C_1 to C_3) of 25 sentences each. The evaluation was performed in a balanced manner, as shown in Table 3.
Table 3. Assignment of translation tasks
[Table 3 is reproduced as an image in the original publication.]
(4) Results and analysis
Table 4 shows the average translation time of the three translator groups on the test corpora. The numbers in parentheses are the relative differences between the inventive system and the baseline system.
Table 4. Translation time of the different interactive machine translation systems
[Table 4 is reproduced as an image in the original publication.]
It can be seen that the translation time of the inventive system is significantly lower than the baseline system. This indicates a significant reduction in human labor. The variety of translation options may further reduce human labor.
Table 5 gives the KSMR values over three corpora.
Table 5. KSMR values of the different interactive machine translation systems
[Table 5 is reproduced as an image in the original publication.]
It can be seen that the KSMR values of the inventive system are significantly higher than the baseline system. However, these mouse actions do not take much thought and action time, so they have little impact on translation efficiency.
Table 6 gives the number of decodings on the three corpora.
Table 6. Number of decodings of the different interactive machine translation systems
[Table 6 is reproduced as an image in the original publication.]
Table 6 shows that the number of decodings in the new protocol is significantly reduced.
The translation quality (BLEU value) over the three corpora is given in table 7.
Table 7. Translation quality of the different interactive machine translation systems
[Table 7 is reproduced as an image in the original publication.]
The results show that the translation quality of the system of the invention is better than that of the baseline system.

Claims (8)

1. An interactive machine translation method based on bilingual fragments is characterized by comprising the following steps:
1) establishing a mathematical model: for each source language snippet, providing a plurality of translation options to the translator, wherein an optimal translation is obtained through a mathematical model;
2) designing a translator interface: the interface comprises an interactive area and an editing area, wherein the interactive area provides the phrase-segmented source sentence and the plurality of translation options provided for the translator in step 1), and the editing area provides the machine translation when the translator finishes confirming and clicks the translate button;
3) decoding: after the translator has confirmed the bilingual segments in the interactive area, capturing the translator's selection of a translation option for each segment f_i and the current segmentation result of the source sentence, and decoding with a phrase-based statistical machine translation decoder implemented through a multi-stack decoding algorithm;
the decoding includes the following processes:
construct a set as a constraint for decoding:
C = {S, <p_1, f_1, e_1>, <p_2, f_2, e_2>, ..., <p_N, f_N, e_N>}    (3)
wherein p_i is the position of segment f_i in the source sentence; f_i is the i-th source language segment; S is the current segmentation result of the source sentence; e_i is the correct translation of f_i confirmed by the translator; and N is the number of bilingual segments;
taking S as the only segmentation result of the source sentence in the decoding process;
the translation options for each source language phrase or segment are restricted by <p_i, f_i, e_i>, such that only e_i is retained and participates in the subsequent decoding process;
the mathematical model is implemented by the following formula:
[Formula (1), shown as an image in the original publication]
wherein t is a candidate translation, i is the index of a bilingual segment, and s is the source sentence.
2. The bilingual segment-based interactive machine translation method of claim 1, wherein: the translator interface also has three auxiliary functions, namely fragment splitting-merging, translation option reordering and suffix prediction, wherein the fragment splitting-merging is that two bidirectional arrows are arranged above each fragment, one bidirectional outward-pointing arrow is a splitting arrow, and the fragment is split into two shorter fragments; another type of bi-directional inwardly pointing arrow is a merge arrow that merges the current segment and the next segment of the current segment into a longer segment.
3. The bilingual segment-based interactive machine translation method of claim 2, wherein: if no shorter or longer segments are present in the phrase table, then both double-headed arrows do not appear; otherwise if there is a shorter or longer segment in the phrase table, the arrow appears when the mouse is placed over the segment.
4. The bilingual segment-based interactive machine translation method of claim 2, wherein the reordering of translation options is: the translator selects either the default mode or the reordering mode before starting the translation; when a new segment is generated, the translation options are changed, and the options of the segment are arranged and displayed according to the sequence in the phrase table under the default condition; when the reordering mode is selected, the top N translation options in the phrase table are reordered to generate a new option list.
5. The bilingual segment-based interactive machine translation method of claim 4, wherein the reordering comprises:
setting a new option list T for each source language phrase p, adding the option with the highest score in the original phrase list into T, traversing the rest N-1 options, finding the option with the highest diversity with the option in T, and adding the option into T;
repeatedly traversing the remaining options, finding the option with the highest diversity relative to the options in T, and adding that option to T until all N options are reordered; the diversity between translation options t_a and t_b is calculated by the following formula:
[Formula (2), shown as an image in the original publication]
wherein c(t_a, t_b) is the number of words shared by translation option t_a and translation option t_b, and a and b are the indices of the translation options.
6. The bilingual segment-based interactive machine translation method of claim 4, wherein the suffix prediction is: in the editing area, the translator clicks the "predict" button to obtain a predicted suffix from the system; when the button is clicked, the current position of the cursor is recorded, and the text before the cursor is used as the prefix; the confirmed bilingual segments and the prefix are used as constraints to find an optimal suffix; when a new suffix is generated, it replaces the current suffix.
7. The bilingual segment-based interactive machine translation method of claim 6, wherein: if the decoder does not find any matching candidate translations, the suffix is not altered.
8. The bilingual segment-based interactive machine translation method of claim 1, wherein: the translator must click on an option for it to become a confirmed bilingual segment together with its source language segment, and if no translation option of a segment has been clicked, the segment and its options cannot be used as decoding constraints.
CN201710877018.3A 2017-09-25 2017-09-25 Interactive machine translation method based on bilingual fragments Active CN107885729B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710877018.3A CN107885729B (en) 2017-09-25 2017-09-25 Interactive machine translation method based on bilingual fragments

Publications (2)

Publication Number Publication Date
CN107885729A CN107885729A (en) 2018-04-06
CN107885729B CN107885729B (en) 2021-05-11

Family

ID=61780796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710877018.3A Active CN107885729B (en) 2017-09-25 2017-09-25 Interactive machine translation method based on bilingual fragments

Country Status (1)

Country Link
CN (1) CN107885729B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753558B (en) * 2020-06-23 2022-03-04 北京字节跳动网络技术有限公司 Video translation method and device, storage medium and electronic equipment
CN111666776B (en) * 2020-06-23 2021-07-23 北京字节跳动网络技术有限公司 Document translation method and device, storage medium and electronic equipment
CN115345180A (en) * 2021-05-14 2022-11-15 阿里巴巴新加坡控股有限公司 Data processing method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1855211A3 (en) * 2006-05-10 2009-01-21 Xerox Corporation Machine translation using elastic chunks
CN104462072A (en) * 2014-11-21 2015-03-25 中国科学院自动化研究所 Input method and device oriented at computer-assisting translation
CN105320651A (en) * 2014-08-05 2016-02-10 张龙哺 Human-machine interactive translation method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
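Interactive-Predictive Translation based on Multiple Word-Segments; Miguel DOMINGO et al.; European Association for Machine Translation 2016; 2016-12-31; vol. 4; pp. 282-291 *
Research on interactive machine translation technology based on positive feedback; 徐萍; China Master's Theses Full-text Database; 2018-05-15; pp. I138-552 *
Research on interactive machine translation technology based on positive multiple constraints; 付一韬; China Master's Theses Full-text Database; 2017-03-15; pp. I138-5987 *
Research and implementation of an interactive machine translation system based on knowledge management; 张桂平; Wanfang Database; 2013-11-28; pp. 1-105 *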

Also Published As

Publication number Publication date
CN107885729A (en) 2018-04-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant