CN104462072B - Input method and device for computer-aided translation - Google Patents
- Publication number
- CN104462072B, CN201410678005.XA
- Authority
- CN
- China
- Prior art keywords
- translation
- phrase
- candidate
- input
- input method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Machine Translation (AREA)
Abstract
The present invention is an input method for computer-aided translation, comprising: step S1: segmenting the source language sentence into words; step S2: obtaining the machine translation candidate list corresponding to the segmented source language sentence and the best machine translation candidate, and obtaining n-gram hint phrases; step S3: responding to a key press that selects an n-gram hint phrase, or receiving an input keystroke sequence, and obtaining input method phrase candidates; step S4: after the user selects an n-gram hint phrase or an input method phrase candidate by key press, obtaining new n-gram hint phrases and repeating step S3 until the user has finished entering the translation of the source language sentence. The invention also provides an input device for computer-aided translation, comprising a word segmentation module, a translation module, a first generation module, a second generation module, and an input device interface. The invention makes full use of machine translation knowledge, raises the keystroke saving rate by at least 11.04%, and substantially improves the efficiency of human translation.
Description
Technical field
The present invention relates to the technical field of natural language processing, and more particularly to an input method and device for computer-aided translation.
Background art
Machine translation uses a computer to convert between different languages. The language being translated is usually called the source language, and the language translated into is called the target language. Machine translation is the process of converting from the source language to the target language.
Computer-aided translation improves translators' working efficiency by exploiting large numbers of repeated or similar sentences and segments. Unlike machine translation, it does not rely on fully automatic translation by the computer; the whole translation process is completed with human participation. Computer-aided translation automates much of the laborious manual translation process and greatly improves translation efficiency and translation quality.
In recent years, many researchers have tried to use machine translation knowledge to further improve the efficiency of computer-aided translation. The current research focus is post-editing, i.e. editing the output of a machine translation system to produce a high-quality translation. However, because current machine translation rarely produces reasonably satisfactory output, translators have little motivation to correct it carefully, so post-editing has not been widely adopted. In addition, some scholars have proposed translation assistance based on interactive machine translation (see, for example, Sergio Barrachina et al., "Statistical Approaches to Computer-Assisted Translation", Computational Linguistics, 35(1), pp. 3-28, 2009), an approach that gives up fully automatic translation in exchange for a higher-quality translation. The basic idea is that the user points out some errors in the current system output and supplies the correct translation, which is sent back to the translation system for re-decoding; this is repeated until the user's requirements are met. However, interactive translation severely disrupts the human translation workflow and is just as time-consuming and laborious, so such systems are mainly used when the user has limited or very little knowledge of the target language. Since the main users of computer-aided translation are professional translators, interactive translation is almost never used by commercial translation systems. Between 1997 and 2005, Guy Lapalme and Philippe Langlais implemented the TransType translation system on an interactive translation framework; it offers real-time suggestions for the continuation of the translation as the user types. This requires the translator to translate from left to right, while the machine translation system updates its output according to the part already entered so as to give suggestions that are as accurate as possible. The upgraded TransType2 supports three language pairs (English → Spanish, English → French, English → German), but because it is hard to fit into the human translation workflow, the TransType2 interaction mode has not been adopted by other systems. How to use machine translation knowledge to further improve translation efficiency and translation quality is therefore a problem in urgent need of a solution.
Summary of the invention
In view of the above technical problem, the main object of the present invention is to propose an input method and device for computer-aided translation that make full use of machine translation knowledge during input in order to improve translation efficiency and translation quality.
To achieve this object, as one aspect of the present invention, the invention provides an input method for computer-aided translation, comprising the following steps:
Step S1: Segment the source language sentence into words;
Step S2: Using a machine translation engine, obtain the machine translation candidate list corresponding to the segmented source language sentence, and output the highest-scoring machine translation candidate to the input device interface as the best machine translation; generate N n-gram hint phrases from the first N words of the best machine translation, output them to the input device interface, and wait for the user to select one by key press;
Step S3: Respond to the n-gram hint phrase selected by the user's key press, or receive the user's input keystroke sequence; using a log-linear model, compute over the machine translation candidate list and the input keystroke sequence, generate M input method phrase candidates, output them to the input device interface, and wait for the user to select one by key press;
Step S4: Respond to the input method phrase candidate selected by the user's key press, or receive the user's input keystroke sequence, and determine whether the user has finished entering the translation of the source language sentence; if so, end; otherwise, generate N n-gram hint phrases from the already-entered part of the translation and the machine translation candidate list, output them to the input device interface, wait for the user to select one by key press, and jump to step S3;
where N and M are positive integers.
The n-gram hint phrases are defined as follows: the first hint phrase is a unigram containing only one word; the second hint phrase is a bigram containing two words, namely the word of the first hint phrase followed by a second hint word, so that the first hint phrase is a prefix of the second; and so on, such that all words of the (N-1)-th hint phrase form a prefix of the N-th hint phrase, which is an N-gram containing N words. N is a preset positive integer with a default value of 4.
Step S3 further comprises the following steps:
Step S31: Split the input keystroke sequence into per-character coding units to obtain the split input keystroke sequence. The split input keystroke sequence consists of coding units separated by a separator character, and each coding unit is either the complete input method code of the corresponding character or a prefix of that code;
Step S32: Initialize the input method phrase candidate list to empty, and process each coding unit of the split input keystroke sequence in turn as follows:
According to the coding rules of the character input method, compute the target character candidate set for the coding unit;
Apply a decoding algorithm to the target character candidate set, the input method phrase candidate list, and the machine translation candidate list to obtain a new input method phrase candidate list;
Use the log-linear model to score every input method phrase candidate in the new input method phrase candidate list and sort the candidates in descending order of score; if the length of the new list exceeds the preset threshold M, keep only the M highest-scoring input method phrase candidates. The number of target character candidates contained in each input method phrase candidate equals the number of coding units decoded so far, and their order is consistent with the order of the decoded coding units;
Replace the input method phrase candidate list with the new input method phrase candidate list.
The features used by the log-linear model include:
(1) the typing model probability;
(2) the language model probability;
(3) the occurrence probability of the words in the input method phrase candidate;
(4) the occurrence probability of the input method phrase candidate;
(5) a binary feature indicating whether the words in the input method phrase candidate appear in the machine translation candidates;
(6) a binary feature indicating whether the input method phrase candidate appears in the machine translation candidates;
(7) a binary feature indicating whether the input method phrase candidate appears in the user's terminology base.
Step S33: After all coding units of the split input keystroke sequence have been processed, the input method phrase candidate list has length M and is sorted in descending order of score, where M is a preset positive integer with a default value of 5.
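The seven features listed above can be combined into one score as a weighted sum. The following is a minimal sketch, assuming that the four probability features enter as log-probabilities and the three binary features as 0 or 1. The `Features` class, the weights, and the sample values are hypothetical; the patent text does not give weights or a concrete data layout.

```python
import math
from dataclasses import dataclass


@dataclass
class Features:
    """Feature values for one input method phrase candidate (illustrative layout)."""
    typing_prob: float        # (1) typing model probability
    lm_prob: float            # (2) language model probability
    word_prob: float          # (3) occurrence probability of the words
    phrase_prob: float        # (4) occurrence probability of the phrase
    words_in_mt: bool         # (5) words appear in an MT candidate
    phrase_in_mt: bool        # (6) phrase appears in an MT candidate
    phrase_in_termbase: bool  # (7) phrase appears in the user's terminology base


def loglinear_score(f, weights):
    """Weighted sum of log-probabilities and 0/1 indicator features."""
    values = [
        math.log(f.typing_prob),
        math.log(f.lm_prob),
        math.log(f.word_prob),
        math.log(f.phrase_prob),
        float(f.words_in_mt),
        float(f.phrase_in_mt),
        float(f.phrase_in_termbase),
    ]
    return sum(w * v for w, v in zip(weights, values))


def top_m(candidates, weights, m=5):
    """Sort (phrase, features) pairs by descending score and keep the best m."""
    ranked = sorted(candidates,
                    key=lambda c: loglinear_score(c[1], weights),
                    reverse=True)
    return [phrase for phrase, _ in ranked[:m]]


# Hypothetical weights and candidates: identical probabilities, but only the
# first candidate is supported by the machine translation output.
WEIGHTS = [1.0] * 7
supported = ("in_mt", Features(0.5, 0.1, 0.2, 0.05, True, True, False))
unsupported = ("not_in_mt", Features(0.5, 0.1, 0.2, 0.05, False, False, False))
ranking = top_m([unsupported, supported], WEIGHTS, m=5)
```

With positive weights on features (5) to (7), a candidate that agrees with the machine translation output or the terminology base moves ahead of an otherwise equally likely candidate, which is the implicit reordering effect the description relies on.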
Step S4 further comprises the following steps:
Step S41: After responding to the user's key press selecting an n-gram hint phrase or an input method phrase candidate, segment the already-entered part of the translation into words to obtain the segmented entered part of the translation;
Step S42: If the best machine translation contains the last word of the segmented entered part of the translation, apply the maximum prefix matching algorithm to the best machine translation candidate and the segmented entered part of the translation to generate N n-gram hint phrases;
Step S43: If the best machine translation does not contain the last word of the segmented entered part of the translation, select from the machine translation candidate list all candidates that contain that last word to obtain a second-best machine translation candidate list, and take its highest-scoring candidate as the second-best machine translation; apply the prefix matching algorithm to the second-best machine translation candidate and the segmented entered part of the translation to generate N n-gram hint phrases.
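Steps S42 and S43 can be sketched as follows. The patent names a maximum prefix matching algorithm without spelling it out, so the matching rule used here is an assumption: among the positions where the last entered word occurs in a machine translation candidate, pick the one whose preceding words agree with the entered text for the longest run, and offer the words that follow as hints. All function names and the sample sentences are hypothetical.

```python
def match_length(typed, mt, end):
    """Count how many trailing words of `typed` match `mt` ending at index `end`."""
    k = 0
    while k < len(typed) and k <= end and typed[-1 - k] == mt[end - k]:
        k += 1
    return k


def continuation(typed, mt):
    """Words of `mt` that follow the best match of the last entered word, or None."""
    positions = [i for i, word in enumerate(mt) if word == typed[-1]]
    if not positions:
        return None
    best = max(positions, key=lambda i: match_length(typed, mt, i))
    return mt[best + 1:]


def next_hints(typed, nbest, n=4):
    """Generate up to n prefix n-grams for the next words.

    `nbest` is assumed to be sorted by descending score, so nbest[0] is the
    best machine translation (step S42) and the first later candidate that
    contains the last entered word plays the role of the second-best
    machine translation (step S43).
    """
    for mt in nbest:
        rest = mt if not typed else continuation(typed, mt)
        if rest is not None:
            return [rest[:k] for k in range(1, min(n, len(rest)) + 1)]
    return []


# Illustrative English stand-ins for segmented Chinese words.
NBEST = [
    ["China", "considers", "change", "ability", "official", "benefit", "system"],
    ["China", "considers", "reform", "official", "welfare", "system"],
]
hints_s42 = next_hints(["China", "considers"], NBEST)
hints_s43 = next_hints(["China", "considers", "reform"], NBEST)
```

In the second call the best candidate does not contain "reform", so the hints come from the highest-ranked candidate that does, mirroring the fallback in step S43.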
As another aspect of the present invention, the invention also provides an input device for computer-aided translation, comprising a word segmentation module, a translation module, a first generation module, a second generation module, and an input device interface, wherein:
the word segmentation module takes the source language sentence and the already-entered part of the translation, and generates and outputs the segmented source language sentence and the segmented entered part of the translation;
the translation module is connected to the word segmentation module; it uses a machine translation engine to obtain the machine translation candidate list corresponding to the segmented source language sentence, and outputs the highest-scoring machine translation candidate to the input device interface as the best machine translation;
the first generation module is connected to the translation module and the input device interface; it applies a log-linear model to the machine translation candidate list and the input keystroke sequence, generates M input method phrase candidates, and outputs them to the input device interface;
the second generation module is connected to the translation module and the input device interface; it computes over the already-entered part of the translation and the machine translation candidate list, generates N n-gram hint phrases, and outputs them to the input device interface;
the input device interface displays the best machine translation, the input method phrase candidates, and the n-gram hint phrases, and receives the user's key selection commands and input keystroke sequences for entering the translation of the source language sentence.
As a further aspect of the present invention, the invention also provides an input device for computer-aided translation, comprising:
means for segmenting the source language sentence into words;
means for using a machine translation engine to obtain the target-language machine translation candidate list corresponding to the segmented source language sentence, generating a phrase candidate list from the highest-scoring machine translation candidate, and outputting it to the input device interface;
means for, after receiving the keystroke sequence entered by the user, using a log-linear model together with the machine translation candidate list to adjust the phrase candidate list dynamically in real time and output it to the input device interface;
means for responding to the user's key selections until the user has completed the translation of the source language sentence.
The input device further comprises:
means for obtaining N-gram hints from the machine translation candidate list after the user has entered a phrase; and
means for displaying the N-gram hints in the input method interface for the user to select.
As the above technical solution shows, the method and device of the present invention have the following beneficial effects:
(1) Because the input method directly affects translation efficiency, integrating machine translation knowledge into the character input method to form an input method oriented to computer-aided translation smoothly overcomes the limitations of existing interaction modes (such as post-editing and interactive machine translation). Without affecting the user experience, the more efficient input method further improves the translator's translation efficiency and translation quality;
(2) The invention makes effective use of machine translation knowledge: when a computer-aided translation tool that includes machine translation is used, the number of keystrokes is reduced automatically and effectively without disturbing the normal translation workflow. In an English-to-Chinese political news translation experiment, the results show that, relative to the Google Pinyin input method and considering only the quantifiable keystroke count, the invention raises the keystroke saving rate by at least 11.04%, which corresponds to an improvement in working efficiency of at least 11.04%. If the effect of the machine translation output in helping the translator compose the final translation faster is also counted, the efficiency gain is even more evident.
Brief description of the drawings
Fig. 1 is the overall framework diagram of the input method and device for computer-aided translation of the present invention;
Fig. 2 is the refined overall framework diagram of the input method and device for computer-aided translation of the present invention;
Fig. 3 is a schematic diagram of the method and device of the present invention embedded in a computer-aided translation platform;
Fig. 4 is a schematic comparison of the input keystroke sequences with n-gram hint phrases disabled and with n-gram hint phrases enabled;
Fig. 5 is an example of decoding an input keystroke sequence after machine translation knowledge has been incorporated.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
All code of the present invention is implemented in Java and Apache Flex: the back end is written in Java and runs in a Tomcat container, and the input method front end is written in Apache Flex. The development platforms are Ubuntu 12.04 and Windows 7, but the invention is not limited to these. The code does not use any platform-specific programming, so the described system can also run on other operating system versions. This input method is oriented to computer-aided translation; it is integrated with a character input method and is not a general-purpose character input method. The specific computer-aided translation software, machine translation engine, and character input method are not restricted. The character input method may be any character input method, such as the Wubi input method or a pinyin input method.
The basic idea of the present invention is to make appropriate use of machine translation knowledge and to propose an input method oriented to computer-aided translation, in order to improve the translator's translation quality and translation efficiency. The system framework of the invention is shown in Fig. 1. In Fig. 1, the word segmentation module receives the source language sentence and outputs the segmented source language sentence to the translation module; the word segmentation module also receives the already-entered part of the human translation and outputs the segmented entered part to the second generation module. The translation module is connected to the word segmentation module and the second generation module, and outputs the machine translation candidate list corresponding to the segmented source language sentence to the first generation module. The first generation module is connected to the translation module and the input device interface; it receives the user's input keystroke sequence and the machine translation candidate list, and generates and outputs input method phrase candidates to the input device interface. The second generation module is connected to the word segmentation module and the translation module; it receives the segmented entered part of the human translation and the machine translation candidate list, and generates and outputs n-gram hint phrases to the input device interface. The input device interface interacts directly with the user: it displays the best machine translation, the input method phrase candidates, and the n-gram hint phrases, and receives the user's key selection commands and input keystroke sequences for entering the translation of the source language sentence.
Fig. 3 shows a schematic diagram of an example of the present invention (assuming that the character input method is a pinyin input method) embedded in computer-aided translation software. Fig. 3 is divided into two regions, A on the left and B on the right. Region A shows the machine translation candidate list for the user's reference; the user can set the number of machine translation candidates displayed. Region B is the main functional region of the invention. When the user has just started entering the translation, or whenever n-gram hint phrases are available, the user can select the corresponding hint with the Enter key or the number keys 5 to 8, as shown in region B1. In region B2, when no n-gram hint phrase is available, machine translation still helps the user improve efficiency through the invention: words in the machine translation candidate list are preferentially given higher scores, so that, for example, "welfare", which corresponds to "fl", is ranked first directly, sparing the user the trouble of picking the word. The invention therefore speeds up translation not only explicitly, through n-gram hint phrases, but also implicitly, by adjusting the order of the input method phrase candidates in real time. Unlike other machine translation interaction methods, even if the machine translation in region A is hidden, so that the user ignores the machine translation output entirely, the invention can still help the user translate more efficiently.
The present invention proposes an input method for computer-aided translation. Below, a pinyin input method stands in for the character input method and English-to-Chinese translation serves as the embodiment; the principle and implementation of the invention are explained in detail with the following example.
Suppose the source language sentence S is:
China mulls change to officials' welfare system
One of the machine translation candidates, MT:
China considers to change ability official's benefit system
The corresponding human translation, HT:
China considers reform civil servants' welfare system
1. Segment the source language sentence and the already-entered part of the translation into words. The embodiment is as follows:
There are many ways to segment English and Chinese into words. In this embodiment, the open-source word segmentation tool Urheen is used to segment both English and Chinese. Urheen can also segment other languages, such as Japanese, and can be downloaded free of charge from:
http://www.openpr.org.cn/index.php/zh/NLP-Toolkit-For-Natural-Language-Processing/68-Urheen-A-Chinese/English-Lexical-Analysis-Toolkit/View-details.html
In this example, the machine translation candidates and the human translation are segmented automatically, with adjacent words separated by a space.
2. Using a machine translation engine, obtain the machine translation candidate list corresponding to the segmented source language sentence, and output the highest-scoring machine translation candidate to the input device interface as the best machine translation; generate N n-gram hint phrases from the first N words of the best machine translation, output them to the input device interface, and wait for the user to select one by key press.
(1) Obtain the machine translation candidate list.
Once step 1 has produced the segmented source language sentence, the machine translation candidate list, i.e. the n-best list, can be obtained from the machine translation engine. The highest-scoring candidate in the n-best list is taken as the best machine translation and output to the input device interface for the user's reference while the system waits for the user to enter the human translation. The machine translation engine can be any translation engine, for example the well-known open-source engine Moses, which can be downloaded free of charge from:
http://www.statmt.org/moses/N=Moses.Releases
Moses has fairly complete documentation, from which a translation server can easily be deployed.
(2) Generate N n-gram hint phrases from the first N words of the best machine translation.
The N n-gram hint phrases each consist of consecutive words: the first hint phrase is a unigram containing only one word; the second hint phrase is a bigram containing two words, namely the word of the first hint phrase followed by a second hint word, so that the first hint phrase is a prefix of the second; and so on, such that all words of the (N-1)-th hint phrase form a prefix of the N-th hint phrase, which is an N-gram containing N words. N is a preset positive integer; its default value in the embodiment is 4 and it can be customized. In the example, the 4 n-gram hint phrases generated from the first N words of the best machine translation are "China", "China considers", "China considers change", and "China considers change ability". After the 4 n-gram hint phrases are output to the input device interface, they are shown with the following sequence numbers: 5. China; 6. China considers; 7. China considers change; 8. China considers change ability. The user can select a hint phrase by pressing the number key that matches its sequence number, for example pressing "6" to select "China considers".
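The hint construction described above amounts to taking the first one, two, up to N words of the best machine translation. A minimal sketch follows, assuming the translation is already segmented into a list of words; the function names are hypothetical, and the English tokens stand in for the segmented Chinese words of the example.

```python
def ngram_hints(best_translation_tokens, n=4):
    """Return up to n hint phrases; the k-th hint is the first k tokens."""
    limit = min(n, len(best_translation_tokens))
    return [best_translation_tokens[:k] for k in range(1, limit + 1)]


def label_hints(hints, first_key=5):
    """Map number keys (starting at 5, as in the example) to hint phrases."""
    return {str(first_key + i): " ".join(hint) for i, hint in enumerate(hints)}


# Illustrative stand-ins for the segmented best machine translation.
tokens = ["China", "considers", "change", "ability",
          "official", "benefit", "system"]
hints = ngram_hints(tokens)
menu = label_hints(hints)
```

Because each hint is a prefix of the next, one key press lets the user accept exactly as many leading words of the machine translation as are correct.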
3. Respond to the user's key press selecting an n-gram hint phrase, or receive the user's input keystroke sequence; using a log-linear model, compute over the machine translation candidate list and the input keystroke sequence, generate M input method phrase candidates, output them to the input device interface, and wait for the user to select one by key press.
In this example the character input method is a pinyin input method, so the input keystroke sequence is the input method code entered by the user, i.e. a pinyin string; for example, "China considers" corresponds to "zhongguokaolv".
Step S31: Split the input keystroke sequence into per-character coding units to obtain the split input keystroke sequence. The split input keystroke sequence consists of coding units separated by a separator character, and each coding unit is either the complete input method code of the corresponding character or a prefix of that code.
The pinyin string is split character by character, with "'" as the separator. For example, the pinyin string "zhongguokaolv" is split into "zhong'guo'kao'lv", and the pinyin string "zgkl" is split into "z'g'k'l". The splitting algorithm is maximum prefix matching based on a trie (for details see D.E. Knuth, "The Art of Computer Programming", vol. 1, pp. 295-304; "Sorting and Searching", Fundamental Algorithms, vol. III, pp. 481-505, Addison-Wesley, Reading, Mass., 1973).
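The splitting in step S31 can be sketched with a small trie and greedy longest-prefix matching. This is only an illustration under stated assumptions: the syllable inventory below is a tiny hypothetical sample rather than a full pinyin table, a letter that starts no complete syllable is kept as a one-letter prefix unit, and greedy matching does not resolve genuinely ambiguous strings.

```python
END = ""  # marker key for "a complete syllable ends here"; never a typed letter


def build_trie(syllables):
    """Build a nested-dict trie over the given pinyin syllables."""
    root = {}
    for syllable in syllables:
        node = root
        for ch in syllable:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def longest_match(trie, text, start):
    """Length of the longest complete syllable beginning at text[start]."""
    node, best = trie, 0
    for i in range(start, len(text)):
        node = node.get(text[i])
        if node is None:
            break
        if END in node:
            best = i + 1 - start
    return best


def split_keystrokes(trie, keys, separator="'"):
    """Split a keystroke string into coding units joined by the separator."""
    units, pos = [], 0
    while pos < len(keys):
        length = longest_match(trie, keys, pos)
        if length == 0:
            length = 1  # abbreviated input: keep a single letter as a prefix unit
        units.append(keys[pos:pos + length])
        pos += length
    return separator.join(units)


# Hypothetical miniature syllable inventory for the running example.
TRIE = build_trie(["zhong", "guo", "gu", "kao", "ka", "lv"])
full = split_keystrokes(TRIE, "zhongguokaolv")
abbreviated = split_keystrokes(TRIE, "zgkl")
```

Both the full and the abbreviated spelling yield four coding units, so the later decoding steps can treat them uniformly.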
Step S32: Initialize the input method phrase candidate list to empty, and process each coding unit of the split input keystroke sequence in turn as follows:
Step S321: According to the coding rules of the character input method, compute the target character candidate set for the coding unit.
For example, in the pinyin string "z'g'k'l" (the Chinese candidate characters are rendered here by English glosses), "z" corresponds to the target character candidate set "this, again, in, most, do, word, morning, make, person, ...", "g" corresponds to the target character candidate set "cross, should, give, individual, more, high, with, just, each, dry, state, ...", "k" corresponds to the target character candidate set "can, see, fast, open, block, examine, empty, quick, guest, ...", and "l" corresponds to the target character candidate set "come, Lee, inner, old, consider, road, class, woods, ...".
Step S322: Apply a decoding algorithm to the target character candidate set, the input method phrase candidate list, and the machine translation candidate list to obtain a new input method phrase candidate list.
In this embodiment, decoding means converting the split input keystroke sequence (for example "zhong'guo'kao'lv" for "China considers") into the corresponding input method phrase candidates. The input keystroke sequence can be full pinyin, abbreviated pinyin, or double pinyin. One object of the present invention is to shorten long keystroke sequences such as "zhong'guo'kao'lv" to the shortest form "z'g'k'l" as far as possible, which character input methods could not do at the time this patent was filed.
Because the target character candidate set of each coding unit is very large, the number of input method phrase candidates grows exponentially as the coding units are combined. A decoding algorithm is therefore needed (for example beam search decoding; for details see Och, Franz Josef, Nicola Ueffing, and Hermann Ney, "An Efficient A* Search Algorithm for Statistical Machine Translation", Proceedings of the Workshop on Data-driven Methods in Machine Translation - Volume 14, Association for Computational Linguistics, 2001) to search the target character candidate sets of the coding units quickly and merge them to extend the input method phrase candidates.
Step S323: Score each input method phrase candidate in the new input method phrase candidate list with the log-linear model and sort the candidates in descending order of score. If the length of the new list exceeds the preset threshold M, retain only the M highest-scoring input method phrase candidates. The number of target word candidates contained in each input method phrase candidate equals the number of coding units decoded so far, and the order of the target word candidates in each input method phrase candidate is consistent with the order of the decoded coding units.
While the decoding algorithm searches the target word candidate sets of the coding units and merges them into extended input method phrase candidates, the length of the input method phrase candidate list grows exponentially, so the list must be pruned to keep its length within a fixed limit. During pruning, a log-linear model (for a detailed description see Knoke, David, and Peter J. Burke, eds., "Log-linear Models", vol. 20, Sage, 1980) is used to score each input method phrase candidate in the new list and sort the candidates in descending order. The new input method phrase candidate list then replaces the input method phrase candidate list.
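The extend, score, and prune loop of steps S32 to S323 can be sketched as a small beam search. This is an illustration under stated assumptions, not the patented decoder: the function `decode`, its parameters, and the toy scoring function in the usage example are hypothetical, and the beam starts from one empty hypothesis so that the first coding unit has something to extend.

```python
from typing import Callable


def decode(
    unit_candidates: list[list[str]],
    score: Callable[[str], float],
    beam_size: int = 5,
) -> list[str]:
    """Extend phrase candidates one coding unit at a time, keeping the best beam_size.

    unit_candidates[i] is the target word candidate set of the i-th coding unit.
    score stands in for the log-linear model and returns a higher value for
    a better phrase candidate.
    """
    beam = [""]  # a single empty hypothesis (assumption of this sketch)
    for candidates in unit_candidates:
        # Combine every surviving hypothesis with every candidate of this unit.
        expanded = [prefix + ch for prefix in beam for ch in candidates]
        # Sort in descending order of score and prune to the beam size (M).
        expanded.sort(key=score, reverse=True)
        beam = expanded[:beam_size]
    return beam


if __name__ == "__main__":
    # Toy scorer: prefers hypotheses that are prefixes of one known phrase.
    def toy_score(phrase: str) -> float:
        return 1.0 if "中国考虑".startswith(phrase) else 0.0

    units = [["中", "在"], ["国", "过"], ["考", "可"], ["虑", "来"]]
    print(decode(units, toy_score, beam_size=5))
```

Without pruning, the four units above would already produce 16 hypotheses; with realistic candidate sets the growth is exponential, which is the reason for the threshold M.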
Assume that the segmented input keystroke sequence is $K$, that the corresponding set of input method phrase candidates is $H$, and that the input method phrase candidate with the highest probability is $\hat{h}$. The corresponding log-linear model of the present invention is:
$$\hat{h} = \arg\max_{h \in H} \sum_{m} \lambda_m f_m(h, K)$$
where $\lambda_m$ is the weight of the $m$-th feature function, set manually according to experience and the actual scenario, and $f_m(h, K)$ is one of the following feature functions:
(1) typing model probability;
(2) language model probability;
(3) occurrence probability of the words in the input method phrase candidate;
(4) occurrence probability of the input method phrase candidate;
(5) binary feature indicating whether a word in the input method phrase candidate appears in a machine translation candidate;
(6) binary feature indicating whether the input method phrase candidate appears in a machine translation candidate;
(7) binary feature indicating whether the input method phrase candidate appears in the user's terminology bank.
Features (1)-(4) can be initialized from the following seed lexicon:
http://www.datatang.com/data/45925
A Chinese character pinyin table can be downloaded from the following address:
http://www.datatang.com/data/11858
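The scoring with the seven features listed above can be illustrated as a weighted sum. The sketch below makes several assumptions that are not in the original text: features (1)-(4) are supplied as precomputed log-probabilities, feature (5) fires if any character of the candidate occurs in any machine translation candidate, and all function names, weights, and example values are hypothetical.

```python
def binary_features(
    candidate: str, mt_candidates: list[str], user_terms: set[str]
) -> list[float]:
    """Features (5)-(7): membership tests against MT output and the term bank."""
    # (5) assumed reading: any character of the candidate appears in an MT candidate
    word_in_mt = float(any(ch in mt for mt in mt_candidates for ch in candidate))
    # (6) the whole candidate appears in an MT candidate
    phrase_in_mt = float(any(candidate in mt for mt in mt_candidates))
    # (7) the candidate is in the user's terminology bank
    in_terms = float(candidate in user_terms)
    return [word_in_mt, phrase_in_mt, in_terms]


def log_linear_score(features: list[float], weights: list[float]) -> float:
    """Weighted sum of feature values, i.e. sum over m of lambda_m * f_m."""
    return sum(w * f for w, f in zip(weights, features))


def rank(
    candidates: list[str],
    log_probs: dict[str, list[float]],  # features (1)-(4) per candidate
    mt_candidates: list[str],
    user_terms: set[str],
    weights: list[float],               # seven manually set weights
    top_m: int = 5,
) -> list[str]:
    """Score every candidate and return the top_m in descending order of score."""
    scored = []
    for cand in candidates:
        feats = log_probs[cand] + binary_features(cand, mt_candidates, user_terms)
        scored.append((log_linear_score(feats, weights), cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in scored[:top_m]]


if __name__ == "__main__":
    probs = {"中国考虑": [-1.0] * 4, "这个可乐": [-1.0] * 4}
    ranked = rank(
        ["这个可乐", "中国考虑"], probs,
        mt_candidates=["中国考虑福利系统"], user_terms=set(),
        weights=[1, 1, 1, 1, 2, 2, 2],
    )
    print(ranked)
```

With equal probabilities for both candidates, the binary features (5) and (6) alone decide the order, which shows how the machine translation output steers the input method.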
Step S33: After the calculation has been completed for all coding units in the segmented input keystroke sequence, the input method phrase candidate list has length M and is sorted in descending order of score, where M is a preset positive integer. In this example M is 5, and the value can be customized.
The phrase candidate list is displayed in the second row of the input device interface, five candidates per page, numbered 0 to 4. The space bar selects the candidate numbered 0, the Ctrl key selects the candidate numbered 1, and the number keys 0 to 4 select the corresponding candidates. The result for "z'g'k'l" is shown in Fig. 5.
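The key bindings described above can be written as a small lookup. This is a sketch only; the key names "space" and "ctrl" and the function `select_candidate` are hypothetical labels chosen for the illustration.

```python
from typing import Optional


def select_candidate(key: str, page: list[str]) -> Optional[str]:
    """Map a selection key to a candidate on the current page (numbered 0 to 4).

    Space selects candidate 0, Ctrl selects candidate 1, and the number
    keys 0 to 4 select the candidate with that number.
    """
    if key == "space":
        index = 0
    elif key == "ctrl":
        index = 1
    elif key in {"0", "1", "2", "3", "4"}:
        index = int(key)
    else:
        return None  # not a selection key
    # A page may hold fewer than five candidates.
    return page[index] if index < len(page) else None


if __name__ == "__main__":
    page = ["c0", "c1", "c2", "c3", "c4"]
    print(select_candidate("space", page), select_candidate("3", page))
```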
4. In response to the user selecting an n-gram hint phrase or an input method phrase candidate by key press, generate N n-gram hint phrases from the entered translation part and the machine translation candidate list, output them to the input device interface, and wait for the user's key selection; repeat step 3 above until the user has finished entering the translation of the source language sentence.
Step S41: After responding to the user's key selection of an n-gram hint phrase or an input method phrase candidate, segment the entered translation part into words using step 1 above to obtain the segmented entered translation part.
Step S42: If the optimal machine translation contains the last word of the segmented entered translation part, apply the maximum prefix matching algorithm to the optimal machine translation candidate and the segmented entered translation part to generate N n-gram hint phrases.
In this example, after the user enters "welfare", prefix matching on "welfare" succeeds, and a new round of numbered N-gram hints is generated: 5. system.
Step S43: If the optimal machine translation does not contain the last word of the segmented entered translation part, select from the machine translation candidate list all candidates that contain that last word to obtain a suboptimal machine translation candidate list, and take its highest-scoring candidate as the suboptimal machine translation; then apply the prefix matching algorithm to the suboptimal machine translation candidate and the segmented entered translation part to generate N n-gram hint phrases.
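Steps S42 and S43 can be sketched as follows. The matching here is a simplified stand-in for the maximum prefix matching algorithm: it looks for the occurrence of the last entered word whose preceding words agree with the entered text for the longest stretch. The function names, the n-best representation as (score, words) pairs, and the English example sentence are assumptions made for this illustration.

```python
from typing import Optional


def match_end(mt_words: list[str], typed_words: list[str]) -> Optional[int]:
    """Index just past the longest backward match of the entered words, or None."""
    best_end, best_len = None, 0
    for i, word in enumerate(mt_words):
        if word != typed_words[-1]:
            continue
        length = 0
        # Walk backwards while the MT words agree with the entered words.
        while (length <= i and length < len(typed_words)
               and mt_words[i - length] == typed_words[-1 - length]):
            length += 1
        if length > best_len:
            best_end, best_len = i + 1, length
    return best_end


def ngram_hints(mt_words: list[str], typed_words: list[str], n: int = 4) -> list[list[str]]:
    """Nested hints (unigram, bigram, ...) that continue the entered translation."""
    start = 0 if not typed_words else match_end(mt_words, typed_words)
    if start is None:
        return []
    following = mt_words[start:start + n]
    return [following[:k] for k in range(1, len(following) + 1)]


def hints_with_fallback(nbest, typed_words: list[str], n: int = 4) -> list[list[str]]:
    """Use the best translation if it contains the last entered word (S42),
    otherwise the highest-scoring candidate that does (S43)."""
    ranked = sorted(nbest, key=lambda item: item[0], reverse=True)
    for _, words in ranked:
        if not typed_words or typed_words[-1] in words:
            return ngram_hints(words, typed_words, n)
    return []


if __name__ == "__main__":
    best = ["China", "considers", "the", "welfare", "system", "reform"]
    print(hints_with_fallback([(0.9, best)], ["China", "considers", "the", "welfare"]))
```

Each hint is a prefix of the next one, matching the nested unigram, bigram, and longer hint phrases described for the method.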
The n-gram hint phrases can be disabled or enabled as the situation requires; Fig. 4 contrasts the two cases. In Fig. 4, the left panel shows the case with n-gram hint phrases disabled and the right panel the case with them enabled.
The input method for computer-aided translation provided by the present invention is implemented in computer software. Accordingly, the present invention also provides an input device for computer-aided translation. Fig. 2 shows the system framework of this input device, which comprises a word segmentation module, a translation module, a first generation module, a second generation module, and an input device interface, wherein:
The word segmentation module takes the source language sentence and the entered translation part and outputs the segmented source language sentence and the segmented entered translation part; segmentation can be carried out as in step 1 of the input method described above, by calling any of various word segmentation tools, including Urheen;
The translation module is connected to the word segmentation module; it uses a machine translation engine to obtain the machine translation candidate list for the segmented source language sentence and outputs the highest-scoring candidate to the input device interface as the optimal machine translation;
The first generation module is connected to the translation module and the input device interface; it processes the machine translation candidate list and the input keystroke sequence as in step 3 above, using the log-linear model to generate M input method phrase candidates, and outputs them to the input device interface;
The second generation module is connected to the translation module and the input device interface; it processes the entered translation part and the machine translation candidate list as in step 4 above to generate N n-gram hint phrases, and outputs them to the input device interface;
The input device interface displays the optimal machine translation, the input method phrase candidates, and the n-gram hint phrases, and receives the user's key selection commands and input keystroke sequences for entering the translation of the source language sentence.
As a preferred embodiment, the present invention also provides an input device for computer-aided translation, comprising:
a means for segmenting the source language sentence into words, which can do so as in step 1 of the input method described above by calling any of various word segmentation tools, including Urheen;
a means for using a machine translation engine to obtain the target-language machine translation candidate list for the segmented source language sentence, generating a phrase candidate list from the highest-scoring machine translation candidate, and outputting it to the input device interface; this means can obtain the machine translation candidate list, i.e. the n-best list, as in step 2 above;
a means for, after receiving the keystroke sequence entered by the user, using the log-linear model together with the machine translation candidate list to adjust the phrase candidate list dynamically in real time and output it to the input device interface;
a means for responding to the user's key selections until the user has completed the translation of the source language sentence.
Preferably, the input device for computer-aided translation of the present invention further comprises: a means for obtaining N-gram hints with reference to the machine translation candidate list after the user has entered a phrase; and a means for displaying the N-gram hints in the input method interface for the user to select.
As a preferred embodiment, the present invention also provides an input device for computer-aided translation, whose graphical interface is shown in Fig. 1, comprising:
a means for segmenting the source language sentence into words;
a means for using a machine translation engine to obtain the machine translation candidate list for the segmented source language sentence, outputting the highest-scoring candidate to the input device interface as the optimal machine translation, generating N n-gram hint phrases from the first N words of the optimal machine translation, outputting them to the input device interface, and waiting for the user's key selection;
a means for responding to the user's key selection of an n-gram hint phrase, or receiving the user's input keystroke sequence, applying the log-linear model to the machine translation candidate list and the input keystroke sequence to generate M input method phrase candidates, outputting them to the input device interface, and waiting for the user's key selection;
a means for responding to the user's key selection of an input method phrase candidate, or receiving the user's input keystroke sequence, and determining whether the user has finished entering the translation of the source language sentence; if so, processing ends; if not, N n-gram hint phrases are generated from the entered translation part and the machine translation candidate list and output to the input device interface, the user's key selection is awaited, and the above response step is executed again in a loop;
wherein N and M are positive integers.
5. Experimental setup
To verify whether the present invention can substantially increase translation efficiency, 4,040 pairs were randomly selected from the translation logs of the aided translation system of a private collaborative translation platform (http://cotrans.me) and randomly divided into two groups of 2,020 pairs each. Each group was further randomly split into a development set (1,000 pairs) and a test set (1,020 pairs). The machine translation system in the aided translation system is implemented with a phrase-based translation model. Parameters were tuned with the free, open-source ZMERT, which can be downloaded from the following address:
http://joshua-decoder.org/4.0/zmert.html
During parameter tuning on the development set, the evaluation metric parameter was set to "-m BLEU4 shortest" (see, for example, Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu, "BLEU: a method for automatic evaluation of machine translation", in Proc. of ACL, 2002). The baseline system is the Google cloud input method, which can be accessed through the following link:
http://www.google.com/inputtools/try/
The evaluation metric is the keystroke saving rate (KSR). Because different translation systems may output different numbers of translation candidates, this experiment uses only the highest-scoring translation candidate of each source language sentence as the reference, in order to avoid this difference. The calculation formulas are as follows:
Google cloud input method:
The present invention:
where $T$ is the set of Chinese human-translated sentences, $C$ is the set of optimal machine translations of all the English sentences, $t = t_1 t_2 \ldots t_m$ is a Chinese human translation, $m$ is the number of words in that human translation, and $c$ is the optimal machine translation. $mk_{norm}(t)$ is the minimum number of keystrokes needed to enter the Chinese sentence $t$ word by word with the full pinyin input method; $mk(t)$ is the minimum number of keystrokes needed to enter the human translation $t$ with the present invention when the machine translation is identical to the human translation; $k_{Google}(t)$ is the actual number of keystrokes needed to enter the human translation $t$ with the Google cloud input method; $k(c, t)$ is the actual number of keystrokes needed to enter the Chinese sentence $t$ with the present invention with reference to the machine translation $c$. For Chinese, following Wei Cui, "Evaluation of Chinese Character Keyboards", Computer, 18(1), pp. 54-59, 1985, the following formula holds:
where $len(t_i)$ is the number of Chinese characters in the word $t_i$. The value of $mk(t)$ can be computed by the following formula:
where $N$ is the number of N-gram hints, 4 by default; $sl$ is the number of separators between words, for example $sl = 0$ for Chinese and $sl = 1$ for English; $sp$ is the number of key presses needed to select a word from the input method results, normally $sp = 1$.
The keystroke saving rate is a decimal between 0 and 1: 0 means that no keystrokes are saved at all, and 1 means that the ideal state has been reached, in which the number of keystrokes cannot be reduced any further.
6. Experimental results
Table 1 gives the performance of the present invention and the Google cloud input method on the two groups of test data. The keystroke saving rate of the present invention exceeds that of the Google cloud input method by 11.04 and 11.26 percentage points on the two groups, respectively. This demonstrates the effectiveness and superiority of the input method for computer-aided translation.
In summary, the experimental results show that the input method and device for computer-aided translation of the present invention make effective use of machine translation knowledge and can greatly improve the input speed and translation efficiency of professional translators.
Table 1. Keystroke saving rate (%) of the present invention and the Google cloud input method
Experimental group | Google cloud input method | The present invention |
---|---|---|
1 | 37.40 | 48.44 |
2 | 36.44 | 47.70 |
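The reported improvements are the differences between the two columns of Table 1, in percentage points. As a quick check, with the table values copied into a hypothetical dictionary `TABLE_1`:

```python
# Keystroke saving rates (%) copied from Table 1:
# experimental group -> (Google cloud input method, present invention)
TABLE_1 = {1: (37.40, 48.44), 2: (36.44, 47.70)}


def absolute_gain(baseline: float, ours: float) -> float:
    """Difference in percentage points, rounded to two decimals."""
    return round(ours - baseline, 2)


if __name__ == "__main__":
    for group, (google, ours) in TABLE_1.items():
        print(group, absolute_gain(google, ours))
```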
Because the method of the present invention is not designed for two specific languages, the method and device of the present invention are generally applicable. Although the present invention has been tested only on the English-to-Chinese translation direction and with the full pinyin input method, it is also applicable to other language pairs and other character input methods, such as the Chinese-to-English and English-to-French translation directions and the Wubi (five-stroke) input method.
The specific embodiments described above further explain the purpose, technical solution, and beneficial effects of the present invention in detail. It should be understood that the foregoing are only specific embodiments of the present invention and are not intended to limit it; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (6)
1. An input method for computer-aided translation, comprising the following steps:
Step S1: segmenting the source language sentence into words;
Step S2: using a machine translation engine to obtain the machine translation candidate list for the segmented source language sentence, and outputting the highest-scoring machine translation candidate to the input device interface as the optimal machine translation; generating N initial n-gram hint phrases from the first N words of the optimal machine translation, outputting them to the input device interface, and waiting for the user's key selection; wherein the N n-gram hint phrases are hint phrases composed of consecutive words, such that: the first hint phrase is a unigram containing only one word; the second hint phrase is a bigram containing two words, namely the word of the first hint phrase and the second hint word, the word of the first hint phrase being the prefix of the second hint phrase; and so on, all the words of the (N-1)-th hint phrase being the prefix of the N-th hint phrase, which is an N-gram containing N words;
Step S3: responding to the user's key selection of an n-gram hint phrase, or receiving the user's input keystroke sequence; applying the log-linear model to the machine translation candidate list and the input keystroke sequence to generate M input method phrase candidates, outputting them to the input device interface, and waiting for the user's key selection;
Step S4: responding to the user's key selection of an input method phrase candidate, or receiving the user's input keystroke sequence, and determining whether the user has finished entering the translation of the source language sentence; if so, ending; if not, performing maximum prefix matching on the entered translation part and the machine translation candidate list to generate N updated n-gram hint phrases, outputting them to the input device interface, waiting for the user's key selection, and jumping to step S3;
wherein N and M are preset positive integers.
2. The input method for computer-aided translation according to claim 1, characterized in that applying the log-linear model to the machine translation candidate list and the input keystroke sequence to generate M input method phrase candidates comprises the following steps:
Step S31: segmenting the input keystroke sequence into coding units to obtain the segmented input keystroke sequence; the segmented input keystroke sequence consists of coding units separated by a separator character, each coding unit being either the whole character input method code of the corresponding word or a prefix of that code;
Step S32: initializing the input method phrase candidate list to empty, and performing the following calculation in turn for each coding unit in the segmented input keystroke sequence:
computing the target word candidate set for the coding unit according to the coding rules of the character input method;
using a decoding algorithm to compute a new input method phrase candidate list from the target word candidate set, the input method phrase candidate list, and the machine translation candidate list;
scoring each input method phrase candidate in the new input method phrase candidate list with the log-linear model and sorting the candidates in descending order of score; if the length of the new input method phrase candidate list exceeds the preset threshold M, retaining only the M highest-scoring input method phrase candidates; the number of target word candidates contained in each input method phrase candidate being equal to the number of decoded coding units, and the order of the target word candidates in each input method phrase candidate being consistent with the order of the decoded coding units;
replacing the input method phrase candidate list with the new input method phrase candidate list;
Step S33: after the calculation has been completed for all coding units in the segmented input keystroke sequence, the input method phrase candidate list has length M and is sorted in descending order of score, where M is a preset positive integer.
3. The input method for computer-aided translation according to claim 2, characterized in that the features used by the log-linear model include:
(1) typing model probability;
(2) language model probability;
(3) occurrence probability of the words in the input method phrase candidate;
(4) occurrence probability of the input method phrase candidate;
(5) binary feature indicating whether a word in the input method phrase candidate appears in a machine translation candidate;
(6) binary feature indicating whether the input method phrase candidate appears in a machine translation candidate;
(7) binary feature indicating whether the input method phrase candidate appears in the user's terminology bank.
4. The input method for computer-aided translation according to claim 1, characterized in that the step of performing maximum prefix matching on the entered translation part and the machine translation candidate list to generate N updated n-gram hint phrases specifically comprises the following steps:
Step S41: after responding to the user's key selection of an n-gram hint phrase or an input method phrase candidate, segmenting the entered translation part into words to obtain the segmented entered translation part;
Step S42: if the optimal machine translation contains the last word of the segmented entered translation part, applying the maximum prefix matching algorithm to the optimal machine translation candidate and the segmented entered translation part to generate the N updated n-gram hint phrases;
Step S43: if the optimal machine translation does not contain the last word of the segmented entered translation part, selecting from the machine translation candidate list all machine translation candidates that contain that last word to obtain a suboptimal machine translation candidate list, and taking its highest-scoring candidate as the suboptimal machine translation; then applying the prefix matching algorithm to the suboptimal machine translation candidate and the segmented entered translation part to generate the N updated n-gram hint phrases.
5. An input device for computer-aided translation that uses the input method for computer-aided translation according to claim 1, characterized in that the device comprises: a word segmentation module, a translation module, a first generation module, a second generation module, and an input device interface, wherein:
the word segmentation module takes the source language sentence and the entered translation part and outputs the segmented source language sentence and the segmented entered translation part;
the translation module is connected to the word segmentation module; it uses a machine translation engine to obtain the machine translation candidate list for the segmented source language sentence and outputs the highest-scoring machine translation candidate to the input device interface as the optimal machine translation;
the first generation module is connected to the translation module and the input device interface; it applies the log-linear model to the machine translation candidate list and the input keystroke sequence to generate M input method phrase candidates and outputs them to the input device interface;
the second generation module is connected to the translation module and the input device interface; it performs maximum prefix matching on the entered translation part and the machine translation candidate list to generate N updated n-gram hint phrases and outputs them to the input device interface;
the input device interface displays the optimal machine translation, the input method phrase candidates, and the n-gram hint phrases, and receives the user's key selection commands and input keystroke sequences for entering the translation of the source language sentence.
6. An input device for computer-aided translation, comprising:
a means for segmenting the source language sentence into words;
a means for using a machine translation engine to obtain the target-language machine translation candidate list for the segmented source language sentence, outputting the highest-scoring machine translation candidate to the input device interface as the optimal machine translation, generating N initial n-gram hint phrases from the first N words of the optimal machine translation, outputting them to the input device interface, and waiting for the user's key selection; wherein the N n-gram hint phrases are hint phrases composed of consecutive words, such that: the first hint phrase is a unigram containing only one word; the second hint phrase is a bigram containing two words, namely the word of the first hint phrase and the second hint word, the word of the first hint phrase being the prefix of the second hint phrase; and so on, all the words of the (N-1)-th hint phrase being the prefix of the N-th hint phrase, which is an N-gram containing N words;
a means for responding to the user's key selection of an n-gram hint phrase, or receiving the user's input keystroke sequence, applying the log-linear model to the machine translation candidate list and the input keystroke sequence to generate M input method phrase candidates, outputting them to the input device interface, and waiting for the user's key selection;
a means for responding to the user's key selection of an input method phrase candidate, or receiving the user's input keystroke sequence, and determining whether the user has finished entering the translation of the source language sentence; if so, processing ends; if not, N updated n-gram hint phrases are generated from the entered translation part and the machine translation candidate list and output to the input device interface, the user's key selection is awaited, and the above response step is executed again in a loop;
wherein N and M are preset positive integers.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410678005.XA CN104462072B (en) | 2014-11-21 | 2014-11-21 | The input method and device of computer-oriented supplementary translation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104462072A (en) | 2015-03-25 |
CN104462072B (en) | 2017-09-26 |
Family
ID=52908138
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410678005.XA Active CN104462072B (en) | 2014-11-21 | 2014-11-21 | The input method and device of computer-oriented supplementary translation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104462072B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108920472A (en) * | 2018-07-04 | 2018-11-30 | 哈尔滨工业大学 | A kind of emerging system and method for the machine translation system based on deep learning |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105069000A (en) * | 2015-08-24 | 2015-11-18 | 中译语通科技(北京)有限公司 | Interactive prediction input method |
CN107870900B (en) * | 2016-09-27 | 2023-04-18 | 松下知识产权经营株式会社 | Method, apparatus and recording medium for providing translated text |
CN106649293A (en) * | 2016-12-28 | 2017-05-10 | 语联网(武汉)信息技术有限公司 | Translation method and translation system |
CN107123318B (en) * | 2017-03-30 | 2020-05-08 | 河南工学院 | Foreign language writing learning system based on input method device |
CN107885729B (en) * | 2017-09-25 | 2021-05-11 | 沈阳航空航天大学 | Interactive machine translation method based on bilingual fragments |
CN108829686B (en) * | 2018-05-30 | 2022-04-15 | 北京小米移动软件有限公司 | Translation information display method, device, equipment and storage medium |
US11328132B2 (en) * | 2019-09-09 | 2022-05-10 | International Business Machines Corporation | Translation engine suggestion via targeted probes |
CN111090460B (en) * | 2019-10-12 | 2021-05-04 | 浙江大学 | Code change log automatic generation method based on nearest neighbor algorithm |
CN111079449B (en) * | 2019-12-19 | 2023-04-11 | 北京百度网讯科技有限公司 | Method and device for acquiring parallel corpus data, electronic equipment and storage medium |
CN111339788B (en) | 2020-02-18 | 2023-09-15 | 北京字节跳动网络技术有限公司 | Interactive machine translation method, device, equipment and medium |
CN111597826B (en) * | 2020-05-15 | 2021-10-01 | 苏州七星天专利运营管理有限责任公司 | Method for processing terms in auxiliary translation |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102253930A (en) * | 2010-05-18 | 2011-11-23 | 腾讯科技(深圳)有限公司 | Method and device for translating text |
CN102662933A (en) * | 2012-03-28 | 2012-09-12 | 成都优译信息技术有限公司 | Distributive intelligent translation method |
CN103955457A (en) * | 2014-05-20 | 2014-07-30 | 陈北宗 | Machine-aided literature translation program |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8843359B2 (en) * | 2009-02-27 | 2014-09-23 | Andrew Nelthropp Lauder | Language translation employing a combination of machine and human translations |
Also Published As
Publication number | Publication date |
---|---|
CN104462072A (en) | 2015-03-25 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN104462072B (en) | The input method and device of computer-oriented supplementary translation | |
Vogel et al. | The CMU statistical machine translation system | |
US20080040095A1 (en) | System for Multiligual Machine Translation from English to Hindi and Other Indian Languages Using Pseudo-Interlingua and Hybridized Approach | |
Mauser et al. | Extending statistical machine translation with discriminative and trigger-based lexicon models | |
US20050216253A1 (en) | System and method for reverse transliteration using statistical alignment | |
Wu et al. | Inversion transduction grammar constraints for mining parallel sentences from quasi-comparable corpora | |
CN105068997B (en) | The construction method and device of parallel corpora | |
JP2000353161A (en) | Method and device for controlling style in generation of natural language | |
CN105573994B (en) | Statistical machine translation system based on syntax skeleton | |
KR102043353B1 (en) | Apparatus and method for recognizing Korean named entity using deep-learning | |
KR100911372B1 (en) | Apparatus and method for unsupervised learning translation relationships among words and phrases in the statistical machine translation system | |
CN106156013A (en) | The two-part machine translation method that a kind of regular collocation type phrase is preferential | |
Tomás et al. | Statistical phrase-based models for interactive computer-assisted translation | |
Weerasinghe | A statistical machine translation approach to sinhala-tamil language translation | |
Slayden et al. | Thai sentence-breaking for large-scale SMT | |
Ney et al. | Improving word alignment quality using morpho-syntactic information | |
Sin et al. | Attention-based syllable level neural machine translation system for myanmar to english language pair | |
Alabau et al. | Multimodal interactive machine translation | |
JP2005506635A (en) | Computer controlled coder / decoder not limited by language or method | |
Devi et al. | Steps of pre-processing for english to mizo smt system | |
JP2013186673A (en) | Machine translation device and machine translation program | |
Dasgupta et al. | Resource creation and development of an English-Bangla back transliteration system | |
Braune et al. | Rule selection with soft syntactic features for string-to-tree statistical machine translation | |
WO2024004183A1 (en) | Extraction device, generation device, extraction method, generation method, and program | |
WO2024004184A1 (en) | Generation device, generation method, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||