Apparatus and method for improving machine translation quality
Technical field
The present invention relates to machine translation methods, and more particularly to an apparatus and method for improving machine translation quality.
Background
With the development of the Internet, people from different countries who speak different languages communicate more and more, and ever more closely, both in real life and online. Various machine translation tools have emerged as a result. Machine translation uses a computer to translate between different languages. Although the quality of machine translation keeps improving, it still cannot replace human translation, and a translated sentence may even be incomprehensible to the user.
Existing machine translation quality inspection methods are mainly used to evaluate machine translation quality, for example by comparing the output of machine translation with the output of human translation and computing a numerical value that is then used to rate the quality of the machine translation. This evaluation procedure is based on string matching between the output translation of the machine translation system and a pre-specified reference translation; that is, character strings occurring in the output translation are searched for in the reference translation.
There are many ways of processing the character strings to be matched. Methods based on N-gram co-occurrence, such as BLEU (Bilingual Evaluation Understudy) and NIST (US National Institute of Standards and Technology), are currently the main methods of automatic machine translation evaluation. Such a method requires several translators to independently translate the same source language text into target language text. Moreover, for the evaluation to be reasonable and accurate, source language text of considerable length generally has to be evaluated before a comprehensive result can be given. This approach is suitable for machine translation evaluations and competitions, but an ordinary user may not be able to intuitively understand a quality rating expressed as a numerical value. For example, when a Japanese user uses a machine translation tool to translate Japanese source text into Chinese, the evaluation system may tell him that the BLEU value of the translation result is 0.3, but he still cannot judge the overall translation quality, nor does he know which passage was translated poorly.
Moreover, because this evaluation depends on translations produced by human translators, its cost is high. In addition, the evaluation is performed on given source language information, so it cannot evaluate the translation quality of source language information that a user enters in real time.
Furthermore, in existing machine translation systems, even if the user realizes that the translated target language result is poor, the user generally has no effective way to modify the target language result output by the machine translation.
Summary of the invention
To solve the problem in the prior art that machine translation results are unsatisfactory, the present invention provides, in one aspect, a device for improving machine translation quality. The device includes: a source language input module for allowing a user to input the source language to be translated and for displaying the source language input by the user; a machine translation module for translating the source language input by the user in the source language input module into a target language; a machine translation result presentation module for presenting the target language produced by the machine translation module; a machine translation result checking module for translating the target language produced by the machine translation module into a comparison language that is the same language as the source language; and a machine translation result check display module for displaying the comparison language produced by the machine translation result checking module.
The foregoing device for improving machine translation quality further includes a source language editing module for providing information related to the source language to be translated, so that the user can select and modify some of that information for the machine translation module to refer to when translating.
The foregoing device for improving machine translation quality further includes a source language feedback information presentation module for presenting the information related to the source language to be translated that the machine translation module used during translation.
In the foregoing device for improving machine translation quality, the information related to the source language to be translated is provided as selectable options. Alternatively, the information related to the source language to be translated is presented in the form of a structure tree, and the structure tree can be modified by dragging and clicking.
In the device for improving machine translation quality of the present invention, the information related to the source language to be translated includes at least one of word segmentation information, morphological information and syntactic information of the source language.
In the above device for improving machine translation quality, the word segmentation divides the source language into multiple language information units by means of symbols.
In the foregoing device for improving machine translation quality, the symbols include a comma, a space and a slash.
In the above device for improving machine translation quality, the source language input module, the machine translation result presentation module and the machine translation result check display module are input areas or display areas on an HTML or Java web page, or input areas or display areas produced by application software on a computer or single-chip microcomputer.
In the foregoing device for improving machine translation quality, the machine translation module processes the source language to be translated in at least one of the following ways: word segmentation of the source language information, in which the source language information is divided into multiple language information units; part-of-speech analysis of the source language information, in which the parts of speech of the language information units are analyzed; and syntactic analysis of the source language information, in which the grammatical relations between the language information units are analyzed.
In the above device for improving machine translation quality, the machine translation result checking module is further used to compare the source language input in the input module with the comparison language produced by the machine translation result checking module, and to calculate a similarity score.
In the foregoing device for improving machine translation quality, the machine translation result check display module also displays the similarity score for the user's reference.
In another aspect, the present invention provides a device for improving machine translation quality, which includes: a source language input module for allowing a user to input the source language to be translated and for displaying the source language input by the user; a machine translation module for translating the source language input by the user in the source language input module into a target language; a machine translation result presentation module for presenting the target language produced by the machine translation module; and a source language editing module for providing information related to the source language to be translated, so that the user can select and modify some of that information for the machine translation module to refer to when translating.
The foregoing device for improving machine translation quality further includes a source language feedback information presentation module for presenting the information related to the source language to be translated that the machine translation module used during translation.
In the above device for improving machine translation quality, the information related to the source language to be translated includes at least one of word segmentation information, morphological information and syntactic information of the source language.
The above device for improving machine translation quality further includes: a machine translation result checking module for translating the target language produced by the machine translation module into a comparison language that is the same language as the source language; and a machine translation result check display module for displaying the comparison language produced by the machine translation result checking module.
In the above device for improving machine translation quality, the machine translation module processes the source language to be translated in at least one of the following ways: word segmentation of the source language information, in which the source language information is divided into multiple language information units; part-of-speech analysis of the source language information, in which the parts of speech of the language information units are analyzed; and syntactic analysis of the source language information, in which the grammatical relations between the language information units are analyzed.
In the above device for improving machine translation quality, the information related to the source language to be translated is provided as selectable options. Alternatively, the information related to the source language to be translated is presented in the form of a structure tree, and the structure tree can be modified by dragging and clicking.
In the above device for improving machine translation quality, the information related to the source language to be translated includes at least one of word segmentation information, morphological information and syntactic information of the source language.
In the above device for improving machine translation quality, the word segmentation divides the source language into multiple language information units by means of symbols.
In the foregoing device for improving machine translation quality, the symbols include a comma, a space and a slash.
In the above device for improving machine translation quality, the source language input module, the machine translation result presentation module and the machine translation result check display module are input areas or display areas on an HTML or Java web page, or input areas or display areas produced by application software on a computer or single-chip microcomputer.
In the above device for improving machine translation quality, the machine translation result checking module is further used to compare the source language input in the input module with the comparison language produced by the machine translation result checking module, and to calculate a similarity score.
In the above device for improving machine translation quality, the machine translation result check display module also displays the similarity score for the user's reference.
Brief description of the drawings
Fig. 1 shows an embodiment of the device for improving machine translation quality according to the present invention.
Fig. 2 shows a variant of the embodiment of the device for improving machine translation quality according to the present invention.
Fig. 3 shows an example user interface for editing the parts of speech of the source language when translating with the device for improving machine translation quality according to the present invention.
Fig. 4 shows another example of the user interface for editing the parts of speech of the source language when translating with the device for improving machine translation quality according to the present invention.
Fig. 5a shows the syntax tree of a sentence as analyzed by the computer according to the present invention.
Fig. 5b shows a sentence tree structure according to the present invention.
Fig. 6 shows a workflow diagram of the system for improving machine translation quality according to the present invention.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in more detail below by way of embodiments and with reference to the drawings.
Fig. 1 shows a device for improving machine translation quality according to a first embodiment of the present invention. The device includes: a source language input module 1 for inputting and displaying source language information; a machine translation module 2 for translating the source language information entered by the user in the source language input module 1 into target language information, where the machine translation module 2 may be embodied, for example, as a process running on the CPU of a computer or single-chip microcomputer; a machine translation result presentation module 3 for presenting the target language information produced by the machine translation module 2; a machine translation result checking module 4 for translating the target language information produced by the machine translation module 2 back into source language comparison information for comparison with the original source language; and a machine translation result check presentation module 5 for presenting the source language comparison information produced by the machine translation result checking module 4. The source language input module 1, the machine translation result presentation module 3 and/or the machine translation result check presentation module 5 may be, for example, input and/or display areas on an HTML or Java web page, or input and/or display areas produced by application software on a computer or single-chip microcomputer. The translation functions of the machine translation module 2 and the machine translation result checking module 4 are triggered by a machine translation trigger module/button 10 and a machine translation check trigger module/button 11, respectively. These may be embodied, for example, as two separate buttons on a web page or in a stand-alone program, or in any other manner that distinguishes the two functions. Button 10 may be labeled "Translate" and button 11 may be labeled "Check".
In the device for improving translation quality of this embodiment, the functions of the machine translation module 2 also include, but are not limited to, analyzing the parts of speech of the source language vocabulary and performing sentence-breaking/word-segmentation analysis, grammatical analysis and/or morphological analysis on source language sentences. For example, when the source language is a sentence meaning "She likes to pack kimonos in a big bag", the machine translation module 2 divides the source language information into multiple language information units (glossed here in English): she, likes, kimono, pack, big, bag, inside. These units are used in the translation process of the machine translation module 2. In addition to this word segmentation (that is, breaking the source language information to be translated into units), the machine translation module 2 also performs part-of-speech analysis on the source language information, analyzing the parts of speech and grammar of the language information units before translating.
The machine translation result checking module 4 likewise has some or all of the functions of the machine translation module, such as part-of-speech analysis, sentence-breaking/word-segmentation analysis and/or morphological analysis, but its translation direction is the opposite. These functions are not described again here.
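The relationship between modules 2 and 4 can be illustrated with a small sketch. It is not the patented implementation: the word-by-word dictionaries, the function names and the English/French toy vocabulary are all assumptions made for illustration. It only shows how translating forward and then back into the source language makes a dropped word visible to a user who cannot read the target language.

```python
def translate_units(units, dictionary):
    """Toy word-by-word translation; units without a dictionary entry are dropped."""
    return [dictionary[u] for u in units if u in dictionary]


def round_trip(source_units, forward_dict, backward_dict):
    """Forward translation (role of module 2) followed by back-translation (role of module 4)."""
    target_units = translate_units(source_units, forward_dict)
    contrast_units = translate_units(target_units, backward_dict)
    return target_units, contrast_units


def missing_units(source_units, contrast_units):
    """Source units that do not reappear in the back-translation."""
    present = set(contrast_units)
    return [u for u in source_units if u not in present]


# Hypothetical toy vocabulary: "kimonos" has no entry, so it is silently lost.
FORWARD = {"she": "elle", "likes": "aime"}
BACKWARD = {target: source for source, target in FORWARD.items()}

source = ["she", "likes", "kimonos"]
target, contrast = round_trip(source, FORWARD, BACKWARD)
print(target, contrast, missing_units(source, contrast))
```

A real system would replace the dictionaries with two full translation engines running in opposite directions; the comparison step stays the same.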
In a variant embodiment, the machine translation result checking module 4 can also be used to compare the source language information entered in the input module 1 with the source language comparison information produced by the machine translation result checking module 4, and to calculate a similarity score. The similarity score can be calculated with a known algorithm, such as the n-gram based BLEU and NIST methods. The similarity score can be displayed by the machine translation result check display module 5 for reference.
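As a minimal sketch of such a similarity score, the code below averages clipped n-gram precisions between the source text and its back-translation. This is a deliberate simplification and not BLEU or NIST proper (there is no brevity penalty, geometric mean or information weighting); the function names, the pre-tokenized input and the choice of `max_n=2` are assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(source_tokens, back_tokens, n):
    """Share of back-translation n-grams that also occur in the source (counts clipped)."""
    back = Counter(ngrams(back_tokens, n))
    if not back:
        return 0.0
    src = Counter(ngrams(source_tokens, n))
    matched = sum(min(count, src[gram]) for gram, count in back.items())
    return matched / sum(back.values())


def similarity_score(source_tokens, back_tokens, max_n=2):
    """Plain average of the 1..max_n gram precisions, in the range 0.0 to 1.0."""
    precisions = [ngram_precision(source_tokens, back_tokens, n)
                  for n in range(1, max_n + 1)]
    return sum(precisions) / len(precisions)


source = "i was sent by the county magistrate".split()
back = "i was sent by the county magistrate".split()
print(similarity_score(source, back))
```

Because both inputs are in the source language, no human reference translation is needed, which is the cost advantage the description points out over reference-based evaluation.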
In a variant embodiment, as shown in Fig. 2, the foregoing device for improving machine translation quality further includes a source language feedback information presentation module 6 for presenting, for reference, the word segmentation information, morphological information, syntactic information and the like that the machine translation module 2 used when translating the source language information. Depending on the application, one, several or all of the word segmentation information, morphological information and syntactic information may be selected for presentation in the source language feedback information presentation module 6. For example, the source language feedback information presentation module 6 may be displayed as an HTML input box that shows the word segmentation result generated by the machine translation module 2 for the source sentence, with the words separated by spaces. Alternatively, the segmentation can be marked in other ways, for example with "," or "/". Suppose the source language is the sentence meaning "She likes to pack kimonos in a big bag." The English machine translation is "She likes to put inside the bag and clothing in large." and the check result of the machine translation reads roughly "she likes the inner bag and large clothes." Although the user does not understand English, the user can still tell that the machine did not translate the word "kimono". Through the word segmentation display interface provided by the system, the user can clearly see the incorrect source language segmentation result, so even without understanding English the user can confirm that a segmentation error occurred in the machine translation.
In practice, however, the word segmentation may come out differently. For example, the words of the above sentence may be divided into units glossed as: she, likes, and, clothing, big, bag, inside. When the user sees this segmentation feedback in the source language feedback information presentation module 6, he or she can conclude that the translation has probably gone wrong. To correct such a segmentation, the foregoing device for improving machine translation quality may also include a source language information editing module 7 with which the user edits the information displayed by the source language feedback information presentation module 6. For example, the user adds a space, "," or "/" between words, and the edited information is passed to the machine translation module 2 for translation. Editing of the source language itself is carried out in the source language input module 1. In another, graphical implementation, each source language phrase is shown on a button, with gaps between the buttons indicating the segmentation; buttons can be split and merged by mouse clicks, thereby editing the segmentation. For example, if a button shows the two-character word meaning "clothing", double-clicking between its two characters splits the button into two buttons, one for each character. Conversely, if the characters glossed "and" and "clothes" are shown on two separate buttons, holding down Ctrl and left-clicking the "and" button and then the "clothes" button merges them into a single button showing "kimono", which indicates that they have been merged into one phrase.
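The split and merge operations behind such a button interface can be sketched on a plain list of units. This is only an illustration under stated assumptions: the function names are hypothetical, and the English string "nowhere" stands in for the Chinese example because it can be segmented in two different ways.

```python
def split_unit(units, index, offset):
    """Split units[index] into two units at character position offset (the double-click case)."""
    word = units[index]
    if not 0 < offset < len(word):
        raise ValueError("offset must fall strictly inside the unit")
    return units[:index] + [word[:offset], word[offset:]] + units[index + 1:]


def merge_units(units, index):
    """Merge units[index] with its right-hand neighbour (the Ctrl+click case)."""
    if not 0 <= index < len(units) - 1:
        raise ValueError("index must have a right-hand neighbour")
    return units[:index] + [units[index] + units[index + 1]] + units[index + 2:]


def render(units, separator=" "):
    """Show the segmentation with a separator symbol such as a space, ',' or '/'."""
    return separator.join(units)


# Stand-in example: correct the segmentation "now/here" to "no/where".
units = ["now", "here"]
units = merge_units(units, 0)    # ["nowhere"]
units = split_unit(units, 0, 2)  # ["no", "where"]
print(render(units, "/"))
```

Both functions return new lists instead of modifying their input, so an interface built on them could keep earlier segmentations for an undo feature.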
Through the editing interface the user can correct this segmentation error, for example by merging the two units into the single word "kimono". When translation is run again, the machine translation module can use the newly entered analysis information, and the translation result is corrected to "She likes the kimono packed in big bags inside.". The check result of the machine translation reads roughly "she likes the kimono packed inside the bag." The user can clearly see that the quality of the machine translation has improved.
The user can also edit in other ways. For example, suppose the source language is Chinese and the target language is English. For a source sentence meaning "I was sent by the county magistrate.", the English machine translation is "I am a magistrate sent". Assume the user does not understand English. The user clicks the "Check" button (machine translation result check trigger module 11), which triggers the machine translation result checking module 4. The machine translation result checking module 4 translates "I am a magistrate sent" back into the source language and shows the result, which reads roughly "I was sent by a judge.", in the machine translation result check presentation module 5, so the user knows that the machine has probably not translated the source language correctly. To correct the translation result, the user can change the wording, for example by rewording the source in an explicitly passive form meaning "I was sent by the county magistrate." The English machine translation is then "I was sent by county head." and the check result of the machine translation reads "I was sent by the county magistrate." The user can thus check the back-translated result again. Once the user determines that the back-translated result is substantially consistent with the source language, the user can be confident that the English machine translation is fairly reliable.
In addition to the above ways of editing the source language, the user can also edit its parts of speech. Fig. 3 shows an example user interface for editing the parts of speech of the source language when translating with the device for improving machine translation quality according to the present invention. As shown in Fig. 3, in one implementation the part of speech of each word is shown below that word in the segmentation result. The user can select the correct part of speech from the options in the part-of-speech analysis result, or use the "Add new part of speech" button to add a part of speech that is not shown, as in Fig. 4. If the source language is the sentence rendered here as "Huis gets full marks.", the user can use the part-of-speech editing module to select "measure word" for the word rendered as "Huis". When machine translation is run again, the result is "Get full marks every time." and the check result of the machine translation is "get full marks every time" (not shown). The user then knows that the machine translation result is essentially correct.
In addition, the user can also edit the grammar of the source language to be translated. Figs. 5a and 5b show one implementation, a syntactic structure tree. Specifically, an interactive interface in the computer's browser (implemented, for example, in Flash or Java) displays the tree-shaped syntactic structure, so that the user can see the syntactic structure tree produced by the computer's analysis and how the phrases combine. The graphical interface can distinguish different constituents by display attributes such as color and font, and can distinguish selected from unselected phrases by highlighting or similar means. The user can also change the syntactic structure tree by dragging and clicking with the mouse. For example, Fig. 5a is the syntax tree of a sentence as analyzed by the computer. The computer generally computes the syntax tree with an existing syntactic analysis tool, but the result sometimes does not match what the user meant. In Fig. 5a, the user originally meant "I saw a telescope and a girl together". From the direct combination of the phrases, the user can clearly see that "with a telescope" modifies "saw a girl", and that "a girl" and "with a telescope" are not combined. The user can therefore judge that this syntactic analysis is wrong, and that a target language result produced from it is likely to be incorrect.
At this point the user can delete, add and move parts of the sentence tree to produce the sentence tree structure shown in Fig. 5b, in which "a girl" and "with a telescope" are combined and together serve as the object of "saw", matching the user's intended meaning. When the user then triggers the translate button again, the machine translation result is more accurate.
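The tree edit described above amounts to detaching a subtree and re-attaching it under a different parent. The sketch below assumes a simple node class and bracketed labels (S, VP, NP, PP); the exact shapes of the trees in Figs. 5a and 5b are not reproduced in the text, so the example tree is an assumption that only mirrors the described attachment of "with a telescope".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

    def bracketed(self):
        """Bracketed rendering of the tree, e.g. (NP a girl)."""
        if not self.children:
            return self.label
        inner = " ".join(child.bracketed() for child in self.children)
        return "(" + self.label + " " + inner + ")"


def find_parent(root, target):
    """Return the node whose children include target (by identity), or None."""
    for child in root.children:
        if child is target:
            return root
        found = find_parent(child, target)
        if found is not None:
            return found
    return None


def move_subtree(root, subtree, new_parent):
    """Detach subtree from its current parent and append it under new_parent."""
    old_parent = find_parent(root, subtree)
    if old_parent is None:
        raise ValueError("subtree is not below root")
    old_parent.children = [c for c in old_parent.children if c is not subtree]
    new_parent.children.append(subtree)


def example_tree():
    """Assumed analysis in which the PP attaches to the verb phrase."""
    np = Node("NP", [Node("a"), Node("girl")])
    pp = Node("PP", [Node("with"), Node("a"), Node("telescope")])
    vp = Node("VP", [Node("saw"), np, pp])
    return Node("S", [Node("I"), vp]), np, pp


root, np, pp = example_tree()
move_subtree(root, pp, np)  # the drag operation: PP now belongs to "a girl"
print(root.bracketed())
```

The word order of the sentence is unchanged by the move; only the grouping differs, which is the information the translation module would re-read when translation is triggered again.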
The machine translation module can generate multiple candidate syntax trees and present the several trees with the highest probability for the user to choose from, which makes operation more convenient for the user.
The machine translation module can combine part-of-speech and syntactic information with source language phrases when establishing associations with target language phrases. For example, the same Chinese phrase corresponds to different English phrases under different parts of speech, which improves the accuracy of machine translation. Similarly, the same Chinese phrase corresponds to different English phrases when it serves as different sentence constituents, and when it occurs in different syntax tree structures.
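One way to picture this association is a lookup table keyed by both the phrase and its part of speech. The sketch below is hypothetical throughout: the placeholder `SOURCE_PHRASE` stands for the source word that this text renders as "Huis", the "noun" entry is an assumed default reading, and the fallback rule is an arbitrary choice rather than anything the description specifies.

```python
# Placeholder for the source word rendered as "Huis" in this text (assumption).
SOURCE_PHRASE = "<phrase rendered as 'Huis'>"

# Hypothetical lexicon: the same source phrase maps to different target phrases
# depending on its part of speech.
LEXICON = {
    (SOURCE_PHRASE, "noun"): "Huis",
    (SOURCE_PHRASE, "measure word"): "every time",
}


def choose_target_phrase(phrase, part_of_speech, lexicon=LEXICON):
    """Pick the target phrase for (phrase, part_of_speech).

    Falls back to the alphabetically first known translation of the phrase,
    and to the untranslated phrase when the lexicon has no entry at all.
    """
    key = (phrase, part_of_speech)
    if key in lexicon:
        return lexicon[key]
    candidates = sorted(t for (p, _), t in lexicon.items() if p == phrase)
    return candidates[0] if candidates else phrase


print(choose_target_phrase(SOURCE_PHRASE, "measure word"))
```

The same keying idea extends to sentence constituents or syntax tree positions by adding a further element to the key tuple.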
Likewise, the machine translation result checking module can use the part-of-speech and syntactic information attached to phrases to improve the accuracy of the reverse translation.
The processing of the above modules may be, but is not limited to, distributed processing or centralized processing.
Fig. 6 shows a workflow diagram for improving machine translation quality according to the present invention. The figure is only an example and does not limit the invention. The order of the steps for displaying results and making judgments can be adjusted freely by the user according to preference. The flow chart shows three kinds of information at the same time; the timing or conditions of their display can also be adjusted according to the user's preference.
As shown in Fig. 6, the specific operating method is as follows. In step 610, the user first enters the source language to be translated in the source language input module 1. In step 620, the user clicks button 10 to trigger the machine translation module 2. In step 630, the machine translation module 2 translates the input source language, and the resulting target language information is displayed in the machine translation result presentation module 3; the user then clicks button 11 to trigger the machine translation result checking module 4, which translates the target language information back into source language comparison information for comparison with the original source language, and this comparison information is presented in the machine translation result check presentation module 5. After step 630, step 640 determines whether the machine translation result is satisfactory. If it is, translation ends in step 650. If it is not, the flow proceeds to step 660, which checks whether the source language feedback information contains an error. If it does, the error is corrected in step 670; if it does not, translation ends and the flow goes to step 650.
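The flow of Fig. 6 can be sketched as a loop in which the user's judgments are passed in as callbacks. The function and parameter names are hypothetical, and two points are assumptions not stated in the text: that translation is run again after a correction in step 670, and that the loop is capped by `max_rounds`.

```python
def run_workflow(source, forward, backward, is_satisfied, has_error, correct,
                 max_rounds=5):
    """Translate, back-translate, and let the user correct the input until satisfied.

    Returns the list of (source, target, comparison) triples, one per round.
    """
    history = []
    for _ in range(max_rounds):
        target = forward(source)               # step 630: machine translation
        comparison = backward(target)          # step 630: back-translation check
        history.append((source, target, comparison))
        if is_satisfied(source, comparison):   # step 640
            break                              # step 650: end
        if not has_error(source):              # step 660: nothing to correct
            break                              # step 650: end
        source = correct(source)               # step 670, then retranslate (assumed)
    return history


# Toy stand-ins for the two translation directions and the user's decisions.
forward = {"a b": "X", "ab": "Y"}.get
backward = {"X": "a-b", "Y": "ab"}.get
history = run_workflow(
    "a b", forward, backward,
    is_satisfied=lambda src, cmp: src == cmp,
    has_error=lambda src: " " in src,
    correct=lambda src: src.replace(" ", ""),
)
print(history)
```

In an interactive tool the three callbacks would be replaced by the user's own inspection of modules 5 and 6 and edits made through module 7 or the input module 1.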
Although the device for improving machine translation quality of the present invention has been described above with reference to embodiments of the invention, it is readily apparent that the invention is not limited to the present embodiments, and that those skilled in the art can readily apply it in other embodiments by supplementing, changing, deleting and/or adding elements within the same technical concept. Such changed or modified embodiments are nevertheless intended to fall within the scope of the claims.