CN102262621A - Device and method for checking translated text - Google Patents

Info

Publication number
CN102262621A
CN102262621A CN2010101827827A CN201010182782A CN102262621A CN 102262621 A CN102262621 A CN 102262621A CN 2010101827827 A CN2010101827827 A CN 2010101827827A CN 201010182782 A CN201010182782 A CN 201010182782A CN 102262621 A CN102262621 A CN 102262621A
Authority
CN
China
Prior art keywords
translation
word string
inspection
check
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010101827827A
Other languages
Chinese (zh)
Inventor
钟长林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN2010101827827A priority Critical patent/CN102262621A/en
Publication of CN102262621A publication Critical patent/CN102262621A/en
Pending legal-status Critical Current

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a device and a method for checking a translated text. The device comprises a receiving unit, an analysis unit, a translated text acquisition unit, and a check unit. The receiving unit receives one or more documents. The analysis unit extracts from the one or more documents a first check character string recorded in a first language and a second check character string recorded in a second language, and decomposes the first check character string into at least one check element; the second check character string is the translated text obtained by translating the first check character string, or the first check character string is the translated text obtained by translating the second check character string. The translated text acquisition unit acquires, for each check element, at least one translated text recorded in the second language. The check unit checks each of the at least one check element by searching the second check character string for one of the translated texts of that check element recorded in the second language, and obtains a check result according to the search result.

Description

Translation checking device and translation checking method
Technical field
The present invention relates to a translation checking device and a translation checking method for checking a translation obtained by translating a document. In particular, the invention relates to a device that checks a translation, produced by manually translating a document in one language into another language, for errors such as mistranslations, omissions, and redundant text.
Background art
Current translation methods fall into two categories: human translation and machine translation. Machine translation is fast, inexpensive, and unlikely to omit content. However, language is flexible in expression, complex in structure, and rich in context, so the same sentence may be understood in several ways and its meaning can only be determined from context at translation time. Machine translation is therefore very prone to serious errors such as unreadable sentences and mistranslations, and is usually limited to informal settings in which translation quality is not critical. By contrast, human translation can express the meaning of the original text as accurately as possible and thus achieve the "faithfulness, expressiveness, and elegance" expected of a translation. Because documents such as papers, books, and patents usually demand an accurate translation, they are generally translated by humans.
Human translation comprises a translation step and a proofreading step. Specifically, a translator first translates the original text manually, and a proofreader then manually proofreads the resulting translation. Because a person cannot stay focused throughout the translation process, carelessness is unavoidable, and human translations often contain errors such as omissions, miswritten characters, and input mistakes. This creates a heavy workload for proofreaders, and one or more proofreaders are often needed to proofread manually.
For example, in law-related documents such as patent application documents, an omission in translation or a minor slip such as a wrong character may cause a serious error. It may even leave an entire patent unprotected, bringing enormous economic loss to the patent applicant and the patent agency. In practice, a patent agency often has to assign several staff positions to proofreading translations, and proofreading patent documents has in fact become one of an agency's main operating costs. Reducing the proofreading workload in patent translation is therefore essential to lowering a patent agency's operating costs and improving its case-handling efficiency. The same problems arise in translating official documents, agreements, trade documents, papers, books, and the like.
However, in current human translation, the proofreading of the translation obtained in the translation step is done entirely by hand. The proofreading workload is therefore huge, and it is very difficult to find all of the errors caused by the translator's carelessness, such as omissions, redundant text, and misspellings.
In summary, machine translation is prone to logic errors, unreadable wording, and inelegant translation, whereas human translation is prone to omissions, typing errors, and clerical errors. Both human translation and machine translation thus have obvious shortcomings. How to make the two complement each other is an important technical problem.
The prior art includes machine translation devices, evaluation devices that use a human translation to evaluate a machine translation and thereby assess machine translation quality, and devices that assess translation quality by manually sampling the content of a translated manuscript. However, none of these existing devices solves the above technical problem. They cannot combine the advantage of machine processing, which is not prone to clerical errors, omissions, or redundancy, with the advantage of human translation, which yields a more fluent and more accurate translation. They neither reduce the labor cost of proofreading human translations nor improve the quality of human translations.
Summary of the invention
The technical problem to be solved by the present invention is to provide a translation checking device and a translation checking method that address the difficulty, in the prior art, of finding translation errors caused by the translator's carelessness.
According to one aspect of the present invention, a translation checking device is provided. The translation checking device comprises: a receiving unit that receives one or more documents; an analysis unit that extracts, from the one or more documents, a first check character string recorded in a first language and a second check character string recorded in a second language, and decomposes the first check character string into at least one check element, wherein the second check character string is a translation obtained by translating the first check character string, or the first check character string is a translation obtained by translating the second check character string; a translated text acquisition unit that acquires, for each check element, at least one translation expressed in the second language; a check unit that checks each of the at least one check element by searching the second check character string for one of the translations of that check element expressed in the second language, and obtains a check result according to the search result; and a prompting unit that presents the check result.
According to another aspect of the present invention, a translation checking method is provided. The translation checking method comprises the following steps: receiving one or more documents; extracting, from the one or more documents, a first check character string recorded in a first language and a second check character string recorded in a second language, and decomposing the first check character string into at least one check element, wherein the second check character string is a translation obtained by translating the first check character string, or the first check character string is a translation obtained by translating the second check character string; acquiring, for each check element, at least one translation expressed in the second language; checking each of the at least one check element by searching the second check character string for one of the translations of that check element expressed in the second language, and obtaining a check result according to the search result; and presenting the check result.
The technical solution of the present invention provides a translation checking device that can reduce translation cost and improve translation quality.
In addition, an embodiment of the present invention can more accurately detect errors in a translation such as omissions, redundant text, and writing or input mistakes.
Other features of the present invention will become apparent from the following description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
FIG. 1 is a block diagram showing the structure of a translation checking device 100 as an example of the translation checking device according to the present invention;
FIG. 2 is a flowchart showing the analysis processing of the analysis unit 102 of the translation checking device 100 according to the first embodiment of the present invention;
FIG. 3 is a flowchart showing the translation acquisition processing of the translated text acquisition unit 103 of the translation checking device 100 according to the first embodiment of the present invention;
FIG. 4 is a flowchart showing the check processing of the check unit 104 of the translation checking device 100 according to the first embodiment of the present invention;
FIG. 5 is a flowchart showing the check processing of the check unit 104 of the translation checking device 100 according to the second embodiment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
First embodiment
FIG. 1 is a block diagram showing the structure of a translation checking device 100 as an example of the translation checking device according to the present invention. The translation checking device of this embodiment may be, but is not limited to being, implemented as program software combined with computer hardware. For example, it may also be implemented by embedding code in a processing device that has a central processing unit and a memory. This embodiment assumes that the first document (the original) is written in English and the second document (the translation) is written in Chinese. This is only an example: the original and the translation may be in any two different languages.
Reference numeral 101 denotes a receiving unit; 102, an analysis unit; 103, a translated text acquisition unit; 104, a check unit; and 105, a prompting unit.
The receiving unit 101 receives one or more documents. For example, it receives an English original document input by the user and a Chinese translation document to be checked, where the English original document is the first document, recorded in the first language, and the Chinese translation document is the second document, recorded in the second language. A document may be an electronic document in any of various formats, for example a text document or an image document. Note that the receiving unit 101 may also receive the English original document or the Chinese translation document to be checked through a device such as a scanner, as long as the characters in the document can be obtained.
Referring to FIG. 2, the analysis unit 102 extracts corresponding sentences from the English original document and the Chinese translation document as the first check character string and the second check character string, respectively (steps S201 and S202). The analysis unit 102 may use punctuation marks such as ".", "?", and "..." as delimiters that mark English sentence boundaries, that is, first delimiters, and punctuation marks such as "。", "！", "……", and "？" as delimiters that mark Chinese sentence boundaries, that is, second delimiters. The content between two adjacent first delimiters is extracted as a first check character string. Correspondingly, the content between the two adjacent second delimiters that corresponds to this first check character string, that is, the Chinese translation of the first check character string, is extracted as the second check character string.
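The sentence extraction of steps S201 and S202 can be illustrated with a minimal sketch. It is not the patent's implementation: the delimiter sets (including "!" for English), the function names `split_sentences` and `pair_check_strings`, and the assumption that the n-th English sentence corresponds to the n-th Chinese sentence are all illustrative choices.

```python
import re

# Assumed delimiter sets; the description only gives examples of such marks.
EN_DELIMS = r"(?:\.\.\.|[.?!])"
ZH_DELIMS = r"(?:……|[。！？])"

def split_sentences(text, delim_pattern):
    """Split text after each delimiter, keeping the delimiter with its sentence."""
    # With one capturing group, re.split returns [body, delim, body, delim, ..., tail].
    parts = re.split(f"({delim_pattern})", text)
    sentences = []
    for i in range(0, len(parts) - 1, 2):
        sentence = (parts[i] + parts[i + 1]).strip()
        if sentence:
            sentences.append(sentence)
    tail = parts[-1].strip()  # text after the last delimiter, if any
    if tail:
        sentences.append(tail)
    return sentences

def pair_check_strings(original, translation):
    """Pair sentences by position (assumes a one-to-one sentence alignment)."""
    first = split_sentences(original, EN_DELIMS)
    second = split_sentences(translation, ZH_DELIMS)
    return list(zip(first, second))
```

Positional pairing breaks as soon as a translator merges or splits sentences, so a real tool would need a more robust alignment step.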
Note that the analysis unit 102 may instead extract corresponding paragraphs from the English original document and the Chinese translation document as the first check character string and the second check character string. Control characters such as a carriage return or a line feed may serve as paragraph delimiters. Paragraph boundaries may also be determined from graphical features, for example a line of text that does not fill the full line width followed by a line that starts flush left or with a two-character indent.
Then, the analysis unit 102 decomposes the first check character string into at least one check element (step S203). Specifically, in this embodiment, the analysis unit 102 decomposes the first check character string into check elements such as words, punctuation marks, paragraph delimiters, and phrases.
In this embodiment, the original is written in English, and the analysis unit 102 may use the space and symbols such as "," in a sentence as first sub-delimiters to decompose the first check character string into individual words, each word serving as one check element. For example, for the English sentence "This is a printer apparatus.", the decomposed check elements may include "This", "is", "a", "printer", "printer apparatus", "apparatus", and ".".
Note that, because "printer apparatus" usually occurs as a phrase, the analysis unit of the translation checking device 100 may, after decomposing the first check character string into words, further define a combination of several adjacent words as an additional check element alongside the single-word check elements. For example, a dictionary, user input, or the like may be consulted to determine whether several adjacent words form a phrase; if they do, these adjacent words are defined as a further check element. In this way, "printer apparatus" can be decomposed into "printer" and "apparatus" and can also be kept as "printer apparatus".
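A short sketch of the decomposition in step S203 follows. The phrase dictionary `PHRASES` and the function name `decompose` are hypothetical, and only two-word phrases are handled to keep the example small; the description itself allows any number of adjacent words.

```python
import re

# Hypothetical phrase dictionary; the description says phrases may come from
# a dictionary or from user input.
PHRASES = {("printer", "apparatus")}

def decompose(check_string, phrases=PHRASES):
    """Decompose a sentence into words, punctuation marks, and known two-word phrases."""
    # Words and punctuation marks become separate tokens; spaces are dropped.
    tokens = re.findall(r"\w+|[^\w\s]", check_string)
    elements = list(tokens)
    # Add each adjacent word pair that appears in the phrase dictionary.
    for i in range(len(tokens) - 1):
        pair = (tokens[i].lower(), tokens[i + 1].lower())
        if pair in phrases:
            elements.append(f"{tokens[i]} {tokens[i + 1]}")
    return elements
```

For "This is a printer apparatus." this yields the single words, the final period, and the extra element "printer apparatus", matching the element list given in the description.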
Then, the analysis unit 102 judges whether the analysis processing of the entire document has been completed (step S204). If the judgment is "Yes", the analysis processing ends; otherwise, the processing returns to step S201. Note that, alternatively, as soon as the analysis unit 102 of the translation checking device 100 has analyzed one first check character string, the translated text acquisition unit 103 and the check unit 104 may carry out the subsequent translation acquisition processing and check processing on that analyzed first check character string.
The translated text acquisition unit 103 carries out translation acquisition processing to obtain target translations of the check elements decomposed by the analysis unit 102. In this example, the target translations are in Chinese. First, the check elements decomposed by the analysis unit 102 are obtained (step S301). For example, for the first check character string "This is a printer apparatus.", the check elements decomposed above, "This", "is", "a", "printer", "printer apparatus", "apparatus", and ".", are input. Then, for each check element decomposed by the analysis unit 102, the translated text acquisition unit 103 acquires at least one translation (step S302). Preferably, user-defined translations are acquired first. In addition, as many translations of a check element as possible may be acquired; for "apparatus", for example, several Chinese translations corresponding to "equipment", "instrument", "device", and the like may be acquired. The manner of acquiring translations is not limited, as long as translations can be obtained. For example, the user may input translations in advance, or translations may be obtained by querying local or remote databases or online resources.
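The lookup in step S302, with user-defined translations taking priority, might be sketched as below. The two dictionaries, their Chinese entries, and the name `get_translations` are illustrative assumptions; the description leaves the data source open (user input, local or remote databases, online resources).

```python
# Hypothetical data sources; entries are illustrative only.
USER_GLOSSARY = {"apparatus": ["装置"]}
DICTIONARY = {
    "apparatus": ["设备", "仪器", "装置", "器械"],
    "printer": ["打印机"],
}

def get_translations(element, user_glossary=USER_GLOSSARY, dictionary=DICTIONARY):
    """Return candidate translations, user-defined ones first, without duplicates."""
    key = element.lower()
    candidates = []
    for source in (user_glossary, dictionary):
        for translation in source.get(key, []):
            if translation not in candidates:
                candidates.append(translation)
    return candidates
```

Returning an ordered list rather than a set keeps the user's preferred term at the front, which matters if later steps try candidates in order.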
Note that non-translated check elements may also be set in advance; a non-translated check element is a check element whose translation is identical to the original. For a predetermined non-translated check element, the translated text acquisition unit 103 may use the original text of the check element itself directly as its translation, without obtaining a translation in any other way. For example, reference numerals in patent documents may be set as non-translated check elements. Rules for identifying non-translated check elements may also be set, so that a check element satisfying a rule is judged to be a non-translated check element. For example, when the original is in English, check elements that contain digits or consist entirely of digits most likely need no translation; "300", "S300", and "T1", for instance, are generally reference numerals in patent documents and need no translation, so they can be defined as non-translated check elements.
In addition, check elements that need not be rendered in the translation may also be set in advance; a check element whose meaning need not be reflected in the translation is called an omitted check element. In this embodiment, for example, the "," in the English original, spaces, and tab characters may be set as omitted check elements. For omitted check elements, the translated text acquisition unit 103 does not acquire any translation.
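The two special classes of check element described above can be sketched as a small classifier. The class labels, the set `OMITTED`, and the function names are hypothetical; the digit rule follows the example of "300", "S300", and "T1".

```python
import re

# Assumed set of elements that need no counterpart in the translation.
OMITTED = {",", " ", "\t"}

def classify(element):
    """Classify a check element as 'omitted', 'keep-as-is', or 'translate'."""
    if element in OMITTED:
        return "omitted"
    if re.search(r"\d", element):   # contains a digit, e.g. a reference numeral
        return "keep-as-is"
    return "translate"

def candidate_translations(element, lookup):
    """Return the translations to search for; `lookup` is any dictionary function."""
    kind = classify(element)
    if kind == "omitted":
        return []                   # nothing to search for in the translation
    if kind == "keep-as-is":
        return [element]            # the original text is its own translation
    return lookup(element)
```

An empty candidate list signals to later steps that the element should be skipped rather than reported as missing.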
Then, in step S303, it is judged whether translations have been acquired for all check elements to be checked. If so, the translation acquisition processing ends; otherwise, the processing returns to step S301.
After the translated text acquisition unit 103 has acquired the translations of the check elements, the check unit 104 carries out check processing. For a first check character string analyzed by the analysis unit 102, a check element decomposed from it is input to the check unit 104 (step S401), and one of the translations of this check element acquired by the translated text acquisition unit 103 is also input to the check unit 104 (step S402). Note that omitted check elements, that is, check elements for which no translation is acquired, are not input to the check unit 104.
Then, the check unit 104 searches the corresponding translation, that is, the second check character string, for the input translation (step S403). If the input translation is found, a check result indicating that there is no translation error is obtained; otherwise, the processing proceeds to step S404. In step S404, the check unit 104 judges whether all translations acquired by the translated text acquisition unit for this check element have been checked. If so, a check result indicating a translation error is obtained (step S405); otherwise, the processing returns to step S402 and continues with another translation of this check element.
After step S405, the processing proceeds to step S406. In step S406, the check unit 104 judges whether all check elements decomposed from the first check character string have been checked. If so, the processing ends; otherwise, it returns to step S401.
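The loop of steps S401 to S406 can be summarized in a few lines. This is a sketch under assumptions: `check_elements` and the `get_translations` callback are hypothetical names, and plain substring search stands in for whatever search the device actually performs.

```python
def check_elements(elements, second_string, get_translations):
    """Return the check elements none of whose translations occur in second_string."""
    suspect = []
    for element in elements:                      # S401, repeated until S406 says done
        candidates = get_translations(element)
        if not candidates:
            continue                              # elements without translations are not checked
        # S402-S404: try each candidate translation in turn.
        if not any(c in second_string for c in candidates):
            suspect.append(element)               # S405: possible omission or mistranslation
    return suspect
```

Substring search keeps the sketch simple but can produce false matches when one candidate is contained in an unrelated longer word.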
The prompting unit 105 presents the check result of the check unit 104 to the user. For example, an error mark or annotation may be placed, in the original or a copy of the original, on characters or symbols that may not have been translated or may have been translated incorrectly. Likewise, an error mark or annotation may be placed, in the translation or a copy of the translation, on characters or symbols that may be redundant, omitted, or wrong. Alternatively, a report file recording the position, cause, or type of each translation error may be generated, as long as the user is clearly informed of what is wrong and what is correct. From the prompts of the prompting unit 105, the user can easily see where errors exist and revise the indicated positions and content.
It should be noted that the first check character string and the second check character string may belong to the same document or to different documents. In this embodiment, they are obtained from the first document and the second document respectively, that is, they belong to different documents. In other embodiments, they may be obtained from the same document, such as the same electronic file or the same paper document. For example, first and second check character strings may be stored alternately in one electronic file. More specifically, each first check character string may be immediately followed or preceded by its corresponding second check character string; or a group of several first check character strings may be immediately followed or preceded by the same number of corresponding second check character strings, so that the first and second check character strings together make up one electronic file. As another example, the first check character strings may be placed in the front part of the electronic file and the second check character strings in the rear part. Thus, when the first and second check character strings are placed in the same document, their positions can be arranged flexibly, as long as the correspondence between each first check character string and its second check character string can be identified.
It should also be noted that, when the first and second check character strings are placed in the same document, the receiving unit needs to receive only one document. Similarly, those skilled in the art will appreciate that the first check character string or the second check character string may be extracted from one document or from several documents, as long as a first check character string and a second check character string that are translations of each other can be extracted.
Second embodiment
Next, a translation checking device according to the second embodiment of the present invention is described with reference to the flowchart shown in FIG. 5. The structure of the translation checking device of this embodiment may be the same as that described in the first embodiment, so its description is omitted. In FIG. 5, the same reference numerals as in the first embodiment denote steps that perform the same processing as the check processing of the first embodiment (FIG. 4), and their description is omitted. Only the steps that differ from the first embodiment are described below.
As shown in step S507 of FIG. 5, in the check processing of the second embodiment, after a check element of the first check character string is input to the check unit 104, the check unit 104 counts the number of times this check element occurs in the first check character string as a first occurrence count. Note that the analysis unit may instead determine the first occurrence count after all check elements of the first check character string have been decomposed.
In this embodiment, step S503 in FIG. 5 also differs from step S403 in FIG. 4. In step S403 of FIG. 4, the check unit 104 obtains a check result according to whether a translation is found in the second check character string. More specifically, if one of the translations of the check element is found, a check result indicating no translation error is obtained; otherwise, a check result indicating a translation error is obtained. In step S503 of FIG. 5, the check unit counts the number of times the input translation occurs in the second check character string (the second occurrence count) and judges whether it equals zero. If the second occurrence count equals zero, the processing proceeds to step S404, which is the same as in the first embodiment and is not described again here. If the second occurrence count is not zero, the processing proceeds to step S505, in which a check result is obtained. For example, when the second occurrence count is greater than zero and less than the first occurrence count, a result indicating that a translation error may exist can be obtained; when the second occurrence count is greater than or equal to the first occurrence count, a result indicating that the second check character string contains no translation error can be obtained.
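The count comparison of the second embodiment can be sketched as follows. The function name `check_by_count` and the three result labels are illustrative; the first occurrence count is taken over the decomposed element list, which is one of the two options the description mentions.

```python
def check_by_count(element, elements, second_string, candidates):
    """Compare how often an element and its translation occur (steps S507, S503, S505)."""
    first_count = elements.count(element)              # S507: first occurrence count
    for candidate in candidates:
        second_count = second_string.count(candidate)  # S503: second occurrence count
        if second_count == 0:
            continue                                   # S404: try the next candidate
        if second_count < first_count:
            return "possible error"                    # translated fewer times than it occurs
        return "no error"
    return "error"                                     # no candidate found at all
```

As in the description, the decision is made on the first candidate that occurs at least once; summing counts across synonymous candidates would be a different design.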
The 3rd embodiment
The first and second embodiments above illustrate extracting the first and second check character strings from the original document and the translation document according to the first or second delimiters, and decomposing the first check character string into check elements according to the first sub-delimiters. However, the analysis carried out by the analysis unit of the present invention is not limited to this. In this embodiment, the analysis unit may also divide the original document and the translation document into several corresponding major parts according to keywords in the original or translation document, such as the titles of chapters or parts, or section symbols. Note that the keywords used to divide the chapters or parts of one document may differ from one another. For example, in a Chinese patent specification, the five keywords "Technical Field", "Background Art", "Summary of the Invention", "Brief Description of the Drawings" (or "Description of the Drawings"), and "Detailed Description" may be used to divide the specification of a Chinese patent document serving as the original into five major parts in order. Similarly, the five keywords "BACKGROUND", "Technical Field", "SUMMARY OF THE INVENTION", "BRIEF DESCRIPTION OF THE DRAWINGS", and "DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS" may be used to divide the specification of an English patent document serving as the translation into five major parts corresponding in order to those of the Chinese specification. After this coarse division, analysis processing similar to that of the first and second embodiments is carried out on each major part, and the corresponding first or second check character strings are extracted from each pair of corresponding parts. In this way, the first and second embodiments of the present invention can be further optimized, the processing speed of the analysis unit improved, and the error-checking capability further enhanced.
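The coarse division by section keywords might look like the sketch below. The names `split_sections` and `pair_sections` are hypothetical, and the code takes the first occurrence of each heading as the section start, which is a simplification: a heading word that also appears in body text would be misread.

```python
def split_sections(text, headings):
    """Split text into {heading: body}, using each heading's first occurrence."""
    positions = sorted(
        (text.find(h), h) for h in headings if text.find(h) != -1
    )
    sections = {}
    for n, (idx, heading) in enumerate(positions):
        # A section runs from the end of its heading to the start of the next one.
        end = positions[n + 1][0] if n + 1 < len(positions) else len(text)
        sections[heading] = text[idx + len(heading):end].strip()
    return sections

def pair_sections(original, translation, heading_pairs):
    """Pair section bodies of the two documents via corresponding heading keywords."""
    src = split_sections(original, [a for a, _ in heading_pairs])
    dst = split_sections(translation, [b for _, b in heading_pairs])
    return [(src[a], dst[b]) for a, b in heading_pairs if a in src and b in dst]
```

Each returned pair can then be passed to the sentence-level processing of the earlier embodiments, so a misaligned sentence in one section cannot shift the pairing in the next.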
The 4th embodiment
Next, a translation checking device according to the fourth embodiment of the present invention is described. The structure of the translation checking device of this embodiment may be the same as that described in the first to third embodiments, so its description is omitted. Only the processing that differs from the first embodiment is described below.
The main difference between this embodiment and the first to third embodiments is that the check unit of this embodiment also checks whether the ratio of the length of a first check character string in the original to the length of the corresponding second check character string in the translation is within a preset range.
The present inventor has found that, after text in one language is translated into another language, the ratio between the length of the original and the length of the translation is substantially constant. For example, the ratio of the word count of an English original to the character count of its Chinese translation is approximately 1:1.6, and the ratio of the character counts of a Japanese original and its Chinese translation is approximately 1:0.8. Therefore, by judging whether the ratio of the length of a first check character string in the original to the length of the corresponding second check character string in the translation is within a preset range, the check unit of this embodiment can also obtain a check result indicating whether the translation may contain an error. Specifically, taking an English original and a Chinese translation as an example, if the ratio of the word count of the English original to the character count of the Chinese translation is less than 1:1.1 or greater than 1:1.9, a check result indicating that the translation may contain an error can be obtained directly. Note that this range differs according to the pair of languages being translated. Even for the same pair of languages, the range may differ according to how strictly errors are to be judged. Here the lengths of the first and second check character strings are measured in words or characters, but they may also be measured by the actual number of bytes occupied by the character encoding, as long as the length ratio of the original to the translation is reflected.
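The length-ratio test can be illustrated for the English-to-Chinese case. This sketch reads the stated bounds as "between 1.1 and 1.9 Chinese characters per English word", which is an interpretation; the function name and the way words and characters are counted are also assumptions.

```python
import re

def length_ratio_ok(english, chinese, low=1.1, high=1.9):
    """Return True if Chinese characters per English word fall within [low, high]."""
    words = len(re.findall(r"[A-Za-z0-9]+", english))    # English word count
    chars = len(re.findall(r"[\u4e00-\u9fff]", chinese)) # CJK ideographs only
    if words == 0:
        return chars == 0
    return low <= chars / words <= high
```

For example, "This is a printer apparatus." has 5 words and "这是一台打印机设备。" has 9 Chinese characters, a ratio of 1.8, so the pair passes; a 5-character translation of the same sentence gives 1.0 and is flagged.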
The 5th embodiment
The main difference between the structure of the translation checking device of this embodiment and that of the first embodiment is that the device of this embodiment further comprises a translation storage unit. When the check unit determines that an input translation has been found in the second check character string, that is, when the result of step S403 is "Yes", the check unit determines whether a translation of the check element is already stored in the translation storage unit. If a translation of the check element is already stored, the check unit determines whether the input translation found in the second check character string is identical to the translation of that check element stored in the translation storage unit, and obtains a check result according to the determination. For example, if the translation found in the second check character string differs from the stored translation of the check element, a check result indicating that the check element is translated inconsistently is output. If no translation of the check element is stored in the translation storage unit, the check unit stores the input translation found in the second check character string in the translation storage unit as the translation of that check element.
Except as described above, other aspects of this embodiment may be the same as in the first embodiment.
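The consistency check of this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the translation storage is modeled as a plain dictionary, the function name and the result labels "missing", "ok" and "inconsistent" are hypothetical, and the first candidate translation found in the translated string is the one that is compared and stored.

```python
from typing import Dict, List


def check_consistency(
    element: str,
    candidate_translations: List[str],
    second_string: str,
    saved: Dict[str, str],
) -> str:
    """Check one element against the translated string and the saved store.

    `saved` plays the role of the translation storage: it maps an element
    to the translation that was found for it earlier.
    """
    # Search the translated string for any of the candidate translations.
    found = next((t for t in candidate_translations if t in second_string), None)
    if found is None:
        return "missing"
    previous = saved.get(element)
    if previous is None:
        # Nothing stored yet for this element: remember the translation found.
        saved[element] = found
        return "ok"
    # A translation was stored earlier: it must match the one found now.
    return "ok" if previous == found else "inconsistent"
```

Reusing the same `saved` dictionary across all strings of a document is what allows a term translated two different ways to be flagged.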
Other embodiment
As described above, in the present invention a first delimiter can be preset for dividing out the first check character string. In addition, a plurality of predetermined strings can be set for a specific first delimiter; when a predetermined string appears immediately before or after an occurrence of that first delimiter, the analysis unit does not split at that occurrence. For example, suppose "." is a specific first delimiter and strings such as "No", "NO", "FIG" and "Fig" are preset as predetermined strings. If one of these predetermined strings immediately precedes a ".", that "." is not used as a delimiter at that position; that is, it is not taken as a boundary for extracting the first or second check character string, but is treated as an ordinary character.
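The delimiter exception described above can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, only the case of a predetermined string immediately before the delimiter is handled, and the requirement that the predetermined string start at a word boundary is an added assumption that the text does not state.

```python
from typing import List, Sequence

# Predetermined strings from the example in the text.
PROTECTED_STRINGS = ("No", "NO", "FIG", "Fig")


def _ends_with_protected(prefix: str, protected: Sequence[str]) -> bool:
    """True if `prefix` ends with a protected string that starts a word."""
    for p in protected:
        if prefix.endswith(p):
            head = prefix[: len(prefix) - len(p)]
            # Assumption: the protected string must not be the tail of a longer word.
            if not head or not head[-1].isalnum():
                return True
    return False


def split_check_strings(
    text: str,
    delimiter: str = ".",
    protected: Sequence[str] = PROTECTED_STRINGS,
) -> List[str]:
    """Split `text` at `delimiter`, skipping delimiters after protected strings."""
    pieces: List[str] = []
    start = 0
    for i, ch in enumerate(text):
        if ch != delimiter:
            continue
        if _ends_with_protected(text[:i], protected):
            continue  # treat this delimiter as an ordinary character
        piece = text[start:i].strip()
        if piece:
            pieces.append(piece)
        start = i + 1
    tail = text[start:].strip()
    if tail:
        pieces.append(tail)
    return pieces
```

With this sketch, "See FIG. 2 for details. Part No. 5 is shown." is split into two strings rather than four, because the periods after "FIG" and "No" are not treated as boundaries.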
In the embodiments above, errors in the translation are judged on the basis of the original document: the first check character string extracted from the original document is decomposed, and whether the translation contains an error is judged by searching the second check character string in the translation document for translations of the resulting check elements. However, the present invention can also analyze the second check character string extracted from the translation document, decompose it into a plurality of check elements, acquire a plurality of translations of these check elements in the language of the original, and then search the corresponding first check character string in the original document for one of these translations to determine whether an error exists. For example, when no translation of a check element of the second check character string in the translation document is found in the first check character string of the original, it can be judged that the second check character string contains a redundant translation. In other words, the processing applied to the original document in the above embodiments can instead be applied to the translation document, and the processing applied to the translation document can at the same time be applied to the original document, so as to better detect redundancy, mistranslation, typographical errors and similar mistakes that may exist in the translation.
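The reverse-direction check can be illustrated with the short sketch below. It rests on several assumptions that are not in the source: the translated string is taken to be already decomposed into elements, the back-translations come from a hypothetical dictionary supplied by the caller, matching is a case-insensitive substring search, and elements with no known back-translation are skipped rather than flagged.

```python
from typing import Dict, List


def find_redundant_elements(
    first_string: str,
    second_elements: List[str],
    back_translations: Dict[str, List[str]],
) -> List[str]:
    """Return elements of the translation with no counterpart in the original.

    `back_translations` maps each element of the translation to its possible
    renderings in the language of the original.
    """
    redundant: List[str] = []
    source_lower = first_string.lower()
    for element in second_elements:
        candidates = back_translations.get(element, [])
        if not candidates:
            continue  # assumption: no dictionary entry means no judgment
        if not any(c.lower() in source_lower for c in candidates):
            redundant.append(element)
    return redundant
```

An element returned by this function is a candidate for redundant content in the translation, which a reviewer would still need to confirm.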
In addition, the units and processing flows of the embodiments of the present invention can be combined with one another to form further technical solutions.
In the present invention, "setting" can be performed either by a user operation or automatically by the system.
In the present invention, a "delimiter" can be a single character or a plurality of characters, and can also be a character, string, phrase or the like preset by the user or the system.
In the present invention, "character" refers to the characters of any national script, punctuation marks, mathematical symbols, codes of all kinds (for example, ASCII), delimiters in a document (for example, carriage returns and section breaks), and any other symbols that may be used in a document.
In the present invention, the prompt unit can present the check result in various ways: as text or graphically; in the original or the translation, in a copy of the original or the translation, or in a separate text distinct from both. For example, possible redundancy can be highlighted in red and possibly inconsistent translations in yellow in the translation or its copy, or content that may not have been translated can be highlighted in red in the original or its copy. In short, the manner of presentation is not limited, as long as the user can understand it.
The present invention can also be realized as a plurality of translation checking methods corresponding to the respective embodiments. Because each translation checking method corresponds to the translation checking device of an embodiment, its detailed description is omitted here. Those skilled in the art can directly derive the corresponding translation checking method after reading the above description of the translation checking device.
Although the present invention has been described with reference to exemplary embodiments, it should be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the claims is to be accorded the broadest interpretation so as to encompass all modifications and equivalent structures and functions.

Claims (11)

1. A translation checking device, comprising:
an analysis unit that extracts, from one or more documents, a first check character string recorded in a first language and a second check character string recorded in a second language, and decomposes the first check character string into at least one check element, wherein the second check character string is a translation obtained by translating the first check character string, or the first check character string is a translation obtained by translating the second check character string;
a translation acquisition unit that acquires at least one translation, expressed in the second language, of each check element;
a check unit that checks each of the at least one check element by searching the second check character string for one of the translations, expressed in the second language, of the check element, and obtains a check result according to the search result; and
a prompt unit that presents the check result.
2. The translation checking device according to claim 1, wherein the first check character string and the second check character string belong to the same document, or the first check character string and the second check character string belong to different documents.
3. The translation checking device according to claim 1, further comprising:
a preset unit that presets at least one first delimiter and at least one second delimiter;
wherein the analysis unit extracts the first check character string according to the first delimiter, and extracts the second check character string according to the second delimiter.
4. The translation checking device according to claim 3, wherein, when one or more characters adjacent to a first delimiter constitute a predetermined string, the analysis unit does not perform extraction according to that first delimiter adjacent to the predetermined string.
5. The translation checking device according to claim 1, wherein the analysis unit also presets at least one sub-delimiter, and decomposes the first check character string into the at least one check element according to the at least one sub-delimiter.
6. The translation checking device according to claim 5, wherein the analysis unit also determines whether to define a combination of a plurality of adjacent check elements in the first check character string as a further check element.
7. The translation checking device according to claim 1, wherein the check unit also checks whether the ratio of the length of the first check character string to the length of the second check character string is within a preset range, wherein,
when the ratio of the lengths is not within the preset range, the check result is obtained directly; otherwise, the check unit checks whether the second check character string contains the at least one translation of each check element, and obtains the check result.
8. The translation checking device according to claim 1, wherein,
when the translation acquisition unit determines that the check element is one of predetermined non-translated check elements, it uses the original text of the check element itself as the translation of the check element, wherein a non-translated check element is a check element whose translation is identical to its original text; and
when the translation acquisition unit determines that the check element is one of predetermined not-to-be-translated check elements, the check unit does not check whether the second check character string contains one of the at least one translation of the check element, wherein a not-to-be-translated check element is a check element that need not be reflected in the translation.
9. The translation checking device according to claim 1, wherein
the analysis unit or the check unit also counts the number of occurrences of each check element in the first check character string, as a first occurrence count;
the check unit also counts the number of occurrences, in the second check character string, of the at least one translation, expressed in the second language, of each check element, as a second occurrence count; and
the check unit also determines the magnitude relationship between the first occurrence count and the second occurrence count, and obtains the check result also according to the determination.
10. The translation checking device according to claim 1, further comprising a translation storage unit for storing, for each check element, the translation of the check element that the check unit has found,
wherein, when the check unit finds in the second check character string one of the translations, expressed in the second language, of the check element, the check unit also determines, before the translation storage unit stores the found translation, whether the found translation is identical to the translation of the check element already stored in the translation storage unit, and obtains a check result according to the determination.
11. A translation checking method, comprising the following steps:
receiving one or more documents;
extracting, from the one or more documents, a first check character string recorded in a first language and a second check character string recorded in a second language, and decomposing the first check character string into at least one check element, wherein the second check character string is a translation obtained by translating the first check character string, or the first check character string is a translation obtained by translating the second check character string;
acquiring at least one translation, expressed in the second language, of each check element;
checking each of the at least one check element by searching the second check character string for one of the translations, expressed in the second language, of the check element, and obtaining a check result according to the search result; and
presenting the check result.
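As an illustration of the occurrence-count comparison recited in claim 9, the following Python sketch compares how often an element appears in the original string with how often its translations appear in the translated string. The function name and the interpretation of the comparison ("possible omission", "possible redundancy") are assumptions made for this example; the claim itself only states that a check result is obtained from the magnitude relationship.

```python
from typing import List


def compare_occurrences(
    element: str,
    translations: List[str],
    first_string: str,
    second_string: str,
) -> str:
    """Compare occurrence counts of an element and of its translations."""
    # First occurrence count: the element in the original string.
    first_count = first_string.count(element)
    # Second occurrence count: all candidate translations in the translation.
    # Note: if one candidate is a substring of another, this simple sum
    # counts the overlap twice.
    second_count = sum(second_string.count(t) for t in translations)
    if second_count < first_count:
        return "possible omission"
    if second_count > first_count:
        return "possible redundancy"
    return "ok"
```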
CN2010101827827A 2010-05-26 2010-05-26 Device and method for checking translated text Pending CN102262621A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101827827A CN102262621A (en) 2010-05-26 2010-05-26 Device and method for checking translated text

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010101827827A CN102262621A (en) 2010-05-26 2010-05-26 Device and method for checking translated text

Publications (1)

Publication Number Publication Date
CN102262621A true CN102262621A (en) 2011-11-30

Family

ID=45009253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101827827A Pending CN102262621A (en) 2010-05-26 2010-05-26 Device and method for checking translated text

Country Status (1)

Country Link
CN (1) CN102262621A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040006466A1 (en) * 2002-06-28 2004-01-08 Ming Zhou System and method for automatic detection of collocation mistakes in documents
CN101520779A (en) * 2009-04-17 2009-09-02 哈尔滨工业大学 Automatic diagnosis and evaluation method for machine translation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MAJA POPOVIĆ ET AL: "Word Error Rates: Decomposition over POS Classes and Applications for Error Analysis", 《PROCEEDINGS OF THE SECOND WORKSHOP ON STATISTICAL MACHINE TRANSLATION》 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102929865A (en) * 2012-10-12 2013-02-13 广西大学 PDA (Personal Digital Assistant) translation system for inter-translating Chinese and languages of ASEAN (the Association of Southeast Asian Nations) countries
CN104778155B (en) * 2014-01-09 2017-12-15 阿里巴巴集团控股有限公司 The processing method and processing device of page official documents and correspondence
CN104778155A (en) * 2014-01-09 2015-07-15 阿里巴巴集团控股有限公司 Page content processing method and device
CN104978309A (en) * 2014-04-14 2015-10-14 阿里巴巴集团控股有限公司 Translation abnormity determining method and translation abnormity determining equipment
CN104978309B (en) * 2014-04-14 2018-12-14 阿里巴巴集团控股有限公司 A kind of determination method and apparatus that translation is abnormal
CN104731777A (en) * 2015-03-31 2015-06-24 网易有道信息技术(北京)有限公司 Translation evaluation method and device
CN104731777B (en) * 2015-03-31 2019-02-01 网易有道信息技术(北京)有限公司 A kind of translation evaluation method and device
CN106354731A (en) * 2015-07-16 2017-01-25 中兴通讯股份有限公司 Document inspection method and device
CN105468697A (en) * 2015-11-18 2016-04-06 成都优译信息技术有限公司 Automatic positioning method used for translation teaching system
CN107135429A (en) * 2017-05-12 2017-09-05 武汉斗鱼网络科技有限公司 Barrage message resolution method, device and electronic equipment
CN107135429B (en) * 2017-05-12 2019-10-25 武汉斗鱼网络科技有限公司 Barrage message resolution method, device, electronic equipment and computer-readable storage media
CN107301252A (en) * 2017-08-10 2017-10-27 传神联合(北京)信息技术有限公司 The method and device of former translation matching
CN107885728A (en) * 2017-12-11 2018-04-06 中译语通科技股份有限公司 A kind of QA automatic testing methods and system based on interpreter's translation on line
CN109783826A (en) * 2019-01-15 2019-05-21 四川译讯信息科技有限公司 A kind of document automatic translating method
CN109783826B (en) * 2019-01-15 2023-11-21 四川译讯信息科技有限公司 Automatic document translation method
CN111798190A (en) * 2019-04-03 2020-10-20 阿里巴巴集团控股有限公司 Method and system for processing translation case
CN111798190B (en) * 2019-04-03 2024-01-23 阿里巴巴集团控股有限公司 Method and system for processing translation document
CN112632282A (en) * 2020-12-30 2021-04-09 中科院计算技术研究所大数据研究院 Chinese and English thesis data classification and query method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20111130
