Summary of the invention
To overcome the poor translation quality of translation software and the slow speed of human translation, the invention discloses a fast translation device and method.
The fast translation device of the present invention comprises a historical trace store, an input distribution module, a first-occurrence judgment module, a comparison module, a first-screening store, an identification module, a recovery module, and a display device;
The file to be translated is input into the input distribution module, which divides it into units of whole sentences;
The first-occurrence judgment module examines each sentence to be translated produced by the input distribution module and judges whether that sentence occurs for the first time in the file to be translated; if so, the sentence is stored in the first-screening store; otherwise, the identification module marks it as do-not-translate;
The comparison module is connected to the historical trace store and compares each sentence to be translated produced by the input distribution module with the sentences in the store; if they are identical, the identification module marks the sentence as do-not-translate;
The display device displays the file to be translated with all do-not-translate marks applied;
The recovery module, after translation is complete, returns the translations of all sentences in the historical trace store and the first-screening store to their corresponding positions in the file to be translated according to the correspondence.
Preferably, the input distribution module also identifies letter abbreviations in the file to be translated as single sentences.
Specifically, the do-not-translate mark made by the identification module is a display color that differs from the normal display, together with a non-editable state.
The invention also discloses a fast translation method, comprising a preliminary step of dividing the file to be translated into sentences to be translated, and further comprising:
I1. A step of comparing the sentences to be translated with the sentences in the historical trace store: each sentence to be translated is compared with the sentences in the store; if identical, the sentence is marked as do-not-translate. Sentences so marked do not proceed to step I2;
I2. A first-screening step: judging whether the sentence to be translated occurs for the first time in the file to be translated; if so, the sentence is stored in the first-screening store; otherwise, it is marked as do-not-translate;
I3. A recovery step: after translation is complete, the translations of all sentences in the historical trace store and the first-screening store are returned to their corresponding positions in the file to be translated according to the correspondence.
Preferably, a matching array D is set up; for each sentence, if no identical sentence is stored in D, the sentence is stored in D; otherwise, the sentence is marked as do-not-translate.
Specifically, the preliminary step of dividing the file to be translated into sentences to be translated includes identifying letter abbreviations in the file to be translated as single sentences.
With the fast translation device and method of the present invention, the file to be translated is pre-screened against the historical trace store and, combined with self-comparison, repeated sentences are filtered out, reducing the number of words to be translated. Comparing in units of sentences preserves the quality of the translation. In actual tests, the present invention reduced the translation workload by more than 30%.
Embodiment
The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
The fast translation device of the present invention comprises a historical trace store, an input distribution module, a first-occurrence judgment module, a comparison module, a first-screening store, an identification module, a recovery module, and a display device;
The file to be translated is input into the input distribution module, which divides it into units of whole sentences;
The first-occurrence judgment module examines each sentence to be translated produced by the input distribution module and judges whether that sentence occurs for the first time in the file to be translated; if so, the sentence is stored in the first-screening store; otherwise, the identification module marks it as do-not-translate;
The comparison module is connected to the historical trace store and compares each sentence to be translated produced by the input distribution module with the sentences in the store; if they are identical, the identification module marks the sentence as do-not-translate;
The display device displays the file to be translated with all do-not-translate marks applied;
The recovery module, after translation is complete, returns the translations of all sentences in the historical trace store and the first-screening store to their corresponding positions in the file to be translated according to the correspondence.
When the present invention is applied, the file to be translated is first input into the input distribution module, which divides it into single sentences according to a set rule. A common approach is to use punctuation marks, such as periods, question marks, and ellipses, as sentence delimiters.
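By way of illustration only, the punctuation-based division described above might be sketched in Python as follows. The function name `split_sentences` and the exact delimiter set (ASCII and full-width period, question mark and exclamation mark, plus the ellipsis character) are assumptions made for this sketch and are not specified by the invention. The rule is deliberately naive: it would also split at decimal points or dotted abbreviations.

```python
import re
from typing import List

# Assumed delimiter set: . ? ! and the ellipsis character, plus full-width forms.
# A sentence is a run of non-delimiter characters followed by one or more
# delimiters, or by the end of the text when no delimiter closes it.
_SENTENCE = re.compile(r"[^.?!…。？！]+(?:[.?!…。？！]+|$)")


def split_sentences(text: str) -> List[str]:
    """Split text into sentences, keeping each sentence's closing punctuation."""
    pieces = _SENTENCE.findall(text)
    # Drop surrounding whitespace and discard pieces that are whitespace only.
    return [p.strip() for p in pieces if p.strip()]
```

For instance, under these assumptions `split_sentences("Hello world. How are you?")` yields two units, each retaining its delimiter, so that later exact-match comparison sees the punctuation as part of the sentence.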
The sentences divided out of the file to be translated are compared with the sentences pre-stored in the historical trace store. The comparison principle is exact identity: every word in the sentence, and the order of all the words, must be fully identical. Sentences that match exactly are marked as do-not-translate.
When identifying sentences, the present invention preferably splits out letter abbreviations that have a known meaning as separate single sentences, regardless of whether the abbreviation is delimited by punctuation, for example WTO (World Trade Organization) and USA (United States of America). A letter abbreviation usually sits inside a sentence; the abbreviation is compared first, and then the sentence containing it.
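As a hedged illustration of the abbreviation handling, the sketch below picks out known abbreviations from a sentence so that they can be compared before the sentence itself. The table `KNOWN_ABBREVIATIONS`, the function name `find_abbreviations`, and the rule that an abbreviation is a run of two or more capital letters are all assumptions for this example; the invention does not prescribe how abbreviations are recognized.

```python
import re
from typing import List, Set

# Hypothetical table of abbreviations with a known meaning (examples from the text).
KNOWN_ABBREVIATIONS: Set[str] = {"WTO", "USA"}

# Assumed recognition rule: a whole word made of two or more capital letters.
_CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")


def find_abbreviations(sentence: str) -> List[str]:
    """Return the known abbreviations found in the sentence, in order of appearance."""
    candidates = _CAPS_WORD.findall(sentence)
    # Capitalized words that are not in the table are left inside the sentence.
    return [c for c in candidates if c in KNOWN_ABBREVIATIONS]
```

Each abbreviation returned would be treated as its own unit for comparison, after which the containing sentence is compared as a whole.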
Taking the most common case of English-to-Chinese translation as an example, the historical trace store is a database built from previously accumulated material or from published English-Chinese documents. It is stored in units of single sentences, or of phrases that express a complete meaning, and contains English originals and Chinese translations in one-to-one correspondence. As is well known, each English word may have several meanings, but within a specific sentence the meaning of the word is usually fixed; the meaning of an established sentence or conventional phrase remains essentially the same even in different contexts.
In use, the comparison module compares the sentences in the historical trace store with the sentence units divided out of the file to be translated, screens them according to the exact-identity principle, and marks the identical sentences it finds as do-not-translate in the file to be translated. Sentences marked as do-not-translate do not undergo the subsequent first-occurrence judgment.
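The exact-identity comparison can be sketched as below. Representing the historical trace store as a Python dictionary that maps each original sentence to its stored translation is an assumption of this example, as is the function name `mark_against_history`; a dictionary lookup succeeds only when the whole string is identical, which matches the comparison principle stated above.

```python
from typing import Dict, List, Tuple


def mark_against_history(
    sentences: List[str], history: Dict[str, str]
) -> List[Tuple[str, bool]]:
    """Pair each sentence with a do-not-translate flag.

    The flag is True when the sentence appears verbatim as a key of the
    historical store, so a stored translation already exists for it.
    """
    return [(sentence, sentence in history) for sentence in sentences]
```

Only the sentences whose flag is False would go on to the first-occurrence judgment.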
After the comparison with the historical trace store is complete, the remaining sentences undergo the first-occurrence judgment, that is, whether the sentence occurs for the first time in this file to be translated. The first-occurrence judgment module uses the file itself for screening and filters out repeated sentences. Within a single article, because it is one author's description of one subject, a considerable number of sentences or phrases occur repeatedly. The first-occurrence judgment module judges each sentence in turn; any sentence that is not a first occurrence is marked as do-not-translate, and first occurrences are stored in the first-screening store.
The comparison with the historical trace store should come before the first-occurrence judgment, which reduces the amount of judgment computation. For example, suppose a sentence occurs in the text for the first time and also appears in the historical trace store; the historical comparison alone is enough to mark it as do-not-translate. If the first-occurrence judgment were run first, the historical comparison would still be needed to reach that result. For a given article, the number of repeated sentences is generally smaller than the number of sentences that occur only once, and because the historical trace store accumulates a very large number of sentences, the sentences found in the store often outnumber the repeated ones. The first-occurrence judgment should therefore be placed second.
Sentences that the system has marked as do-not-translate should be shown by the display module in a state that differs from other sentences, for example in a color different from the normal display. To prevent the translator from translating them independently or by mis-operation, do-not-translate sentences can be set to a non-editable state. The translator only needs to translate the sentences that the display module shows as requiring translation, including the first occurrence of each sentence that recurs in the file to be translated, as stored by the first-occurrence judgment module.
After translation is complete, the result is a translation containing a number of gaps, each at the position of a sentence marked as do-not-translate. Using the translations stored in the historical trace store and the sentence translations held by the first-occurrence judgment module, the system backfills the translations of the marked sentences into the gaps according to the correspondence, yielding the complete translation.
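The backfilling step can be illustrated with the following minimal sketch. It assumes that both the historical trace store and the translator's finished work are available as dictionaries keyed by the original sentence; the function name `backfill` and both parameter names are hypothetical.

```python
from typing import Dict, List


def backfill(
    sentences: List[str],
    history: Dict[str, str],
    first_screening: Dict[str, str],
) -> List[str]:
    """Rebuild the complete translation, one entry per original sentence position."""
    translation: List[str] = []
    for sentence in sentences:
        if sentence in history:
            # Gap left by a sentence that matched the historical store.
            translation.append(history[sentence])
        else:
            # First occurrence or later repeat: reuse the translator's rendering.
            # Raises KeyError if the translator has not translated this sentence.
            translation.append(first_screening[sentence])
    return translation
```

Because repeats and first occurrences share the same key, every repeated sentence receives exactly the rendering the translator gave to its first occurrence.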
Fig. 2 shows an embodiment that uses a nested-loop algorithm to process files to be translated in real time and in batches.
The system reads in N files to be translated at a time and applies the translation method of the present invention to each of them. The full text is first segmented into sentences, giving C single sentences. Each sentence is then judged and marked in turn using a loop with an incrementing counter: for the J-th sentence, the system first judges whether it appears in the historical trace store; if so, the sentence is marked as do-not-translate and the judgment moves on to sentence J+1; if not, the system goes on to judge whether the sentence occurs for the first time.
In this embodiment the first-occurrence judgment is implemented as follows. A user-defined matching array D is created, initially empty. For each sentence, if it occurs for the first time, it is stored in D and the judgment moves on to sentence J+1; when the sentence occurs a second time, it is marked as do-not-translate and the judgment moves on to sentence J+1. The matching array D thus ends up holding all the non-repeated sentences of the file to be translated that were not found in the historical trace store. Once the translator has translated every sentence in D, these translations, combined with the historical translations stored in the historical trace store, complete the translation of the whole file. Setting up a matching array keeps the data-processing logic simple and the program's resource consumption low.
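The per-file loop of this embodiment might look like the sketch below. It is an illustration under stated assumptions rather than the implementation shown in Fig. 2: the function name `screen_file`, the dictionary form of the historical trace store, and the use of a list of boolean flags as the do-not-translate mark are all choices made for the example.

```python
from typing import Dict, List, Tuple


def screen_file(
    sentences: List[str], history: Dict[str, str]
) -> Tuple[List[str], List[bool]]:
    """Run the history check, then the first-occurrence check, on each sentence.

    Returns the matching array D (sentences the translator must translate)
    and one do-not-translate flag per sentence position.
    """
    matching_array_d: List[str] = []  # array D, empty at the start
    do_not_translate: List[bool] = []
    for sentence in sentences:  # J runs over the C sentences of the file
        if sentence in history:
            # Found in the historical store: mark and move on to sentence J+1.
            do_not_translate.append(True)
        elif sentence in matching_array_d:
            # Already stored in D, so this is a repeat: mark it.
            do_not_translate.append(True)
        else:
            # First occurrence: store in D and leave it for the translator.
            matching_array_d.append(sentence)
            do_not_translate.append(False)
    return matching_array_d, do_not_translate
```

A batch of N files would simply call `screen_file` once per file, giving each file its own array D. The membership test on a list is linear in the size of D; a set kept alongside D would speed up the lookup, at the cost of departing from the single-array design described in this embodiment.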
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The foregoing describes preferred embodiments of the present invention. Unless they are clearly contradictory, or one presupposes another, the preferred implementations in the preferred embodiments may be combined in any manner. The specific parameters in the embodiments serve only to set out clearly the inventor's verification process and are not intended to limit the scope of patent protection of the present invention, which remains defined by its claims. Any equivalent structural change made using the description and drawings of the present invention likewise falls within the scope of protection of the present invention.