CN103577397A - Computer translation data processing method and computer translation data processing device - Google Patents

Computer translation data processing method and computer translation data processing device

Publication number
CN103577397A
Authority
CN
China
Prior art keywords
semantic pattern
classification
source statement
translation
semantic
Legal status
Pending
Application number
CN201210285384.7A
Other languages
Chinese (zh)
Inventor
吴克文
廖剑
张永刚
林锋
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd
Priority to CN201210285384.7A
Publication of CN103577397A

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides a computer translation data processing method comprising the following steps: receiving a source statement to be translated and segmenting the source statement; looking up the words obtained by segmentation in a classified dictionary and determining the category of each word; searching a semantic pattern database, according to the category of each word in the source statement, to determine the semantic pattern of the source statement; looking up the translation rule corresponding to the semantic pattern; and translating the source statement according to the translation rule. The invention further provides a computer translation data processing device for implementing the method. The method and device address the problem of example statements occupying system space, improve translation query efficiency and speed up system response.

Description

Machine translation data processing method and device
Technical field
The application relates to the technical field of computer-aided translation, and in particular to a machine translation data processing method and device.
Background
With the rapid development of science, technology and the Internet, computer and network technologies have penetrated every aspect of our work and life. Computer-aided translation has also emerged in the translation field, for example the widely used Google Translate, Baidu Translate and Youdao Translate.
A common approach to computer-aided translation is corpus-based translation memory: the sentence to be translated is decomposed into several words, the decomposed words are translated with the help of stored example translations, and the translated pieces are finally recombined. For example, suppose the sentence to be translated means "he has bought a book". It can be decomposed into "he", "bought" and "a book". The system then searches for matching translation examples, for instance "she is reading a book: she is reading a book" and "he has bought a computer: he bought a computer" (in each pair the first half stands for the source sentence and the second half is its English translation). From these examples the decomposed words or phrases "he", "bought" and "a book" can be translated, and combining them yields the translation result "he bought a book".
This approach can break a sentence into very fine-grained units, which can improve translation quality. However, because it relies on exact matching, a large amount of example phrase data must be maintained in the system or database to guarantee the match rate, and this takes up a large amount of storage space. Querying for identical words among a large amount of example phrase data also takes more time, which slows the system's response. When many sentences are submitted for translation concurrently, the system may even crash. In addition, exact matching requires the sentence to be translated to follow a standard sentence form, but in some specialized information domains a large share of the sentences to be translated have no standard format, so no match may be found. In that case the user often has to modify the sentence manually and query repeatedly until the expected result is obtained, which inevitably increases the load on the system.
Summary of the invention
The application provides a machine translation data processing method and device that address the problems of example statements occupying a large amount of system space, low translation query efficiency and slow system response.
To solve the above problems, the application discloses a machine translation data processing method comprising the following steps:
Receiving a source statement to be translated and segmenting the source statement;
Looking up the words obtained by segmentation in a classified dictionary and determining the category of each word;
Searching a semantic pattern database, according to the category of each word in the source statement, to determine the semantic pattern of the source statement;
Looking up the translation rule corresponding to the semantic pattern and translating the source statement according to the translation rule.
Further, searching the semantic pattern database according to the category of each word in the source statement to determine the semantic pattern of the source statement comprises:
Determining the category combinations of the source statement;
Looking up each category combination in the semantic pattern database for a matching semantic pattern and, if one is found, obtaining that semantic pattern;
Comparing the number of categories in the category combination corresponding to each matched semantic pattern, and selecting the semantic pattern whose category combination has the most categories as the semantic pattern of the source statement.
Further, selecting the semantic pattern whose category combination has the most categories as the semantic pattern of the source statement comprises:
Judging whether the category combination with the most categories covers all the word categories of the source statement; if so, selecting the semantic pattern corresponding to that category combination as the semantic pattern of the source statement;
If not, judging whether the category combination of the remaining words of the source statement has a corresponding semantic pattern; if so, obtaining that semantic pattern and using it, together with the semantic pattern corresponding to the category combination with the most categories, as the semantic pattern of the source statement; if not, using the semantic pattern corresponding to the category combination with the most categories as the semantic pattern of the source statement.
Further, determining the category combinations of the source statement comprises:
If the number of categories N is 2, there is one category combination;
If the number of categories N > 2, there are N-1 category combinations in total: the first two categories, counted from the first category, form the first category combination; the first three categories form the second category combination; and so on, until all N categories form the (N-1)-th category combination.
Further, if the semantic pattern of the source statement is a combination of at least two semantic patterns, translating the source statement according to the translation rule comprises:
Translating the corresponding part of the source statement according to the translation rule of each semantic pattern to obtain partial translation results, and combining the partial translation results to obtain the final translation result of the source statement; or
Translating the corresponding part of the source statement according to the translation rule of each semantic pattern to obtain partial translation results, obtaining the translation rule between the semantic patterns, and adjusting the partial translation results according to that rule to obtain the final translation result of the source statement.
The application also discloses a machine translation data processing device, comprising:
A data acquisition module, configured to receive a source statement to be translated and segment the source statement;
A category determination module, configured to look up the words obtained by segmentation in a classified dictionary and determine the category of each word;
A semantic pattern determination module, configured to search a semantic pattern database, according to the category of each word in the source statement, to determine the semantic pattern of the source statement;
A translation module, configured to look up the translation rule corresponding to the semantic pattern and translate the source statement according to the translation rule.
Further, the semantic pattern determination module comprises:
A category combination determining unit, configured to determine the category combinations of the source statement;
A semantic pattern matching unit, configured to look up each category combination in the semantic pattern database for a matching semantic pattern and, if one is found, obtain that semantic pattern;
A comparison and selection unit, configured to compare the number of categories in the category combination corresponding to each matched semantic pattern and select the semantic pattern whose category combination has the most categories as the semantic pattern of the source statement.
Further, the comparison and selection unit comprises:
A judgment sub-unit, configured to judge whether the category combination with the most categories covers all the word categories of the source statement and, if so, select the semantic pattern corresponding to that category combination as the semantic pattern of the source statement;
If not, to judge whether the category combination of the remaining words of the source statement has a corresponding semantic pattern; if so, to obtain that semantic pattern and use it, together with the semantic pattern corresponding to the category combination with the most categories, as the semantic pattern of the source statement; if not, to use the semantic pattern corresponding to the category combination with the most categories as the semantic pattern of the source statement.
Further, if the semantic pattern of the source statement is a combination of at least two semantic patterns, the translation module comprises:
A translation unit, configured to translate the corresponding part of the source statement according to the translation rule of each semantic pattern to obtain partial translation results;
A combining unit, configured to combine the partial translation results to obtain the final translation result of the source statement, or to adjust the partial translation results according to the translation rule between the semantic patterns to obtain the final translation result of the source statement.
Compared with the prior art, the application has the following advantages:
The machine translation data processing method and device of the application segment the source statement, determine its final semantic pattern from the category of each word, and then translate the source statement according to the translation rule corresponding to that semantic pattern. The semantic pattern may match the source statement completely or only partially, so even a source statement that is not in a standard sentence form can be translated from the partial match, which ensures translation quality. The system only needs to maintain a small number of semantic patterns and translation rules and does not need to store a large number of example statements, which reduces the system resources occupied by data; the smaller volume of stored data in turn improves query efficiency and system response speed.
Second, for a source statement made up of several semantic patterns, translation rules between semantic patterns can additionally be defined, so that translation accuracy is maintained even for source statements in non-standard sentence forms. This broadens the range of application of the system and avoids the increase in system load caused by users querying repeatedly.
Of course, a product implementing the application does not necessarily need to achieve all of the above advantages at the same time.
Brief description of the drawings
Fig. 1 is a flow chart of Embodiment 1 of the machine translation data processing method of the application;
Fig. 2 is a structural diagram of Embodiment 1 of the machine translation data processing device of the application;
Fig. 3 is an example system architecture diagram of a terminal having the machine translation data processing device of the application.
Detailed description
To make the above objects, features and advantages of the application clearer, the application is described in further detail below with reference to the drawings and specific embodiments.
Referring to Fig. 1, Embodiment 1 of a machine translation data processing method of the application is shown, comprising the following steps:
Step 101: receive a source statement to be translated and segment the source statement.
The source statement can be segmented in various ways. Taking English-to-Chinese translation as an example, suppose the source statement is "leather case for mobile phone". Segmenting it yields three words: "leather case", "for" and "mobile phone".
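The text leaves the segmentation method open. As one possible illustration, the sketch below uses greedy longest-match segmentation against a phrase list. The phrase list, the `MAX_WORDS` limit and the function name `segment` are assumptions made for this example and do not come from the application.

```python
# Hypothetical phrase list; a real system might derive it from the classified dictionary.
PHRASES = {"leather case", "mobile phone", "for"}
MAX_WORDS = 3  # longest phrase length, in words, that the matcher will try (assumption)


def segment(sentence, phrases=PHRASES, max_words=MAX_WORDS):
    """Split a sentence into the longest known phrases, scanning left to right."""
    tokens = sentence.lower().split()
    result = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + size])
            # A single token is always accepted so that unknown words pass through.
            if size == 1 or candidate in phrases:
                result.append(candidate)
                i += size
                break
    return result
```

Calling `segment("leather case for mobile phone")` with this phrase list gives the three words used in the example above.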
Step 102: look up and analyze the words obtained by segmentation in the classified dictionary and determine the category of each word.
The classified dictionary can be built by first defining the word categories, then collecting the words that may belong to each category, and finally aggregating all the words.
The categories can be defined according to actual needs, for example noun, adjective, verb and so on. A specialized translation device can also define special categories. In e-commerce, for example, what needs translating is mostly product names, descriptions, details and the like, so to ensure accurate translation results the words are generally categorized as product words, qualifiers, attribute words, brand words, model words, specification words, promotion words and so on. Categories can also include personal names, organizations, place names and other proper nouns. Depending on the business, other categories can be defined as well, such as Internet slang, film and television, games or books. Common function words such as prepositions and conjunctions can also be treated as separate categories so that words are classified and identified more accurately.
Suppose the categories in the classified dictionary include product word, qualifier, attribute word, brand word, model word, specification word, promotion word and so on. Looking up the three words obtained by segmentation in the classified dictionary determines that "leather case", "for" and "mobile phone" are a product word, a preposition and a product word respectively. Words of the same category can be labeled in order of appearance, for example with the letters A, B, C, D and so on, or with the numbers 1, 2, 3, 4 and so on. This embodiment assumes letters are used, so the two product words become product word A and product word B.
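The lookup and labeling step can be sketched as follows. This is a minimal illustration, not the application's implementation: the dictionary contents, the "unknown" fallback and the function names are assumptions, and the preposition is given the category name "for" to follow the notation of the worked example later in the description.

```python
from collections import Counter

# Hypothetical classified dictionary: word -> category.
CLASSIFIED_DICT = {
    "leather case": "product",
    "mobile phone": "product",
    "for": "for",  # assumption: the preposition is stored under its own name
}


def categorize(words, dictionary=CLASSIFIED_DICT):
    """Look up each word's category; unknown words get the category 'unknown'."""
    return [dictionary.get(word, "unknown") for word in words]


def label_repeats(categories):
    """Append A, B, C... to categories that occur more than once, in order of appearance."""
    totals = Counter(categories)
    seen = Counter()
    labeled = []
    for category in categories:
        if totals[category] > 1:
            labeled.append(category + chr(ord("A") + seen[category]))
            seen[category] += 1
        else:
            labeled.append(category)
    return labeled
```

Only categories that repeat receive a letter, so a lone preposition keeps its plain name while the two product words become `productA` and `productB`.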
Step 103: search the semantic pattern database, according to the category of each word in the source statement, to determine the semantic pattern of the source statement.
Searching the semantic pattern database according to the category of each word in the source statement to determine the semantic pattern of the source statement comprises:
D1: determine the category combinations of the source statement;
D2: look up each category combination in the semantic pattern database for a matching semantic pattern and, if one is found, obtain that semantic pattern;
D3: compare the number of categories in the category combination corresponding to each matched semantic pattern, and select the semantic pattern whose category combination has the most categories as the semantic pattern of the source statement.
A category combination of the source statement is a combination of the categories of the words in that statement. Following the human left-to-right reading habit, a category combination can be formed from the first category on the left together with the categories that follow it, in order. The category combinations of the source statement are determined as follows: if the number of categories N is 2, there is one category combination; if N > 2, there are N-1 category combinations in total, where the first two categories counted from the first category form the first category combination, the first three categories form the second category combination, and so on, until all N categories form the (N-1)-th category combination. For example, if the number of categories is 5, there are 4 category combinations: the first and second categories; the first to third categories; the first to fourth categories; and the first to fifth categories.
For example, for the aforementioned "leather case for mobile phone", the categories determined are product word, preposition and product word, so there are two category combinations: "product word preposition" and "product word preposition product word". For clarity the category combinations above are written out in full words; in English notation they are the two category combinations "product for" and "product for product".
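The rule for forming category combinations amounts to taking every left-anchored prefix of length 2 to N. A short sketch, with the function name `category_combinations` chosen for this example:

```python
def category_combinations(categories):
    """Return the N-1 left-anchored prefixes of length 2..N as tuples."""
    return [tuple(categories[:k]) for k in range(2, len(categories) + 1)]
```

For three categories this returns two combinations, and for five categories it returns four, matching the counts given above.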
A semantic pattern is a pre-stored form of category combination that can constitute a complete sentence or phrase. Taking the source statement above as an example, under the rules of grammar a product word followed by a preposition generally cannot form a complete sentence or phrase, so that category combination is not a semantic pattern. A product word, a preposition and a product word can form a phrase, so that category combination is a semantic pattern. With the method described above, a semantic pattern that fully matches the source statement may be found in the semantic pattern database. It may also happen that the database contains no semantic pattern fully matching the source statement, and only some of the category combinations in the source statement find a matching semantic pattern. In that case the matched semantic pattern is used directly as the semantic pattern of the source statement, and the categories in the source statement that have no match are handled by a conventional translation method.
Preferably, the following processing can also be applied:
First, the category combination in the source statement for which a matching semantic pattern is found is treated as an independent part. The remaining categories are then treated as all the categories of a new source statement, and the semantic pattern of the new source statement is determined again following steps D1 to D3 above. If it can be determined, the new semantic pattern and the semantic pattern of the independent part are combined as the semantic pattern of the source statement. If it cannot be determined, the semantic pattern of the independent part is used as the semantic pattern of the source statement, and the remaining part is translated by a conventional translation method. If only part of the remainder can be determined, the new source statement is split again: the part whose semantic pattern can be determined becomes another independent part, and the same handling of remaining categories is repeated for what is left.
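Steps D1 to D3 and the repeated handling of the remainder can be sketched as a loop that keeps taking the longest matching prefix. This is an illustrative reading of the description under stated assumptions: the pattern set is invented, the A/B labels are dropped so that patterns are matched on plain category names, and the function names are hypothetical.

```python
# Hypothetical semantic pattern database: each entry is a tuple of categories.
PATTERNS = {
    ("qualifier", "qualifier", "product"),
    ("product", "for", "product"),
}


def longest_prefix_match(categories, patterns=PATTERNS):
    """Steps D1-D3: return the length of the longest matching prefix, or 0 if none."""
    best = 0
    for k in range(2, len(categories) + 1):
        if tuple(categories[:k]) in patterns:
            best = k
    return best


def split_into_patterns(categories, patterns=PATTERNS):
    """Return (segments, rest): matched (start, end) index pairs and the index
    at which the unmatched remainder begins."""
    segments = []
    start = 0
    while start < len(categories):
        length = longest_prefix_match(categories[start:], patterns)
        if length == 0:
            break  # the remainder is left to a conventional translation method
        segments.append((start, start + length))
        start += length
    return segments, start
```

When `rest` equals the number of categories, the whole statement is covered by semantic patterns; otherwise the categories from `rest` onward have no match.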
Step 104: look up the translation rule corresponding to the semantic pattern and translate the source statement according to the translation rule.
If the source statement is determined to contain only one semantic pattern, the translation rule corresponding to that semantic pattern is looked up directly and the source statement is translated according to it.
If the source statement is determined to contain at least two semantic patterns, the translation rule corresponding to each semantic pattern is obtained, each part of the source statement is translated according to the rule of its semantic pattern, and the translation results of the semantic patterns are then combined directly to give the final translation result of the source statement. Translation rules between different semantic patterns can also be defined, in which case the translation result is obtained as follows:
Determine the at least two semantic patterns contained in the source statement;
Translate the category combination corresponding to each of the at least two semantic patterns according to its translation rule to obtain partial translation results;
Query the translation rule between the at least two semantic patterns and adjust the partial translation results according to that rule to obtain the final translation result of the source statement.
The semantic patterns in the semantic pattern database can be stored in direct association with their corresponding translation rules, or with the translation rules between semantic patterns. After a semantic pattern or combination of semantic patterns is determined, the associated translation rule is looked up in the semantic pattern database.
Alternatively, the translation rules for semantic patterns, and those between semantic patterns, can be stored separately and numbered, with the semantic pattern database storing the number of the translation rule corresponding to each semantic pattern or pair of semantic patterns. After a semantic pattern or combination of semantic patterns is determined, the number of the translation rule is obtained and used to query the corresponding rule. Storing semantic patterns and translation rules separately in this way facilitates later data maintenance and processing and reduces system overhead.
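The separated storage scheme can be pictured as two tables joined by a rule number. The sketch below is only an illustration: the table contents, the rule number and the representation of a rule as an output slot order are all assumptions.

```python
# Hypothetical tables; the rule number and rule content are made up for illustration.
PATTERN_TO_RULE_ID = {
    ("productA", "for", "productB"): 1,
}
RULES = {
    1: ("productB", "productA"),  # output order of the translated slots
}


def lookup_rule(pattern, pattern_table=PATTERN_TO_RULE_ID, rule_table=RULES):
    """Resolve a semantic pattern to its translation rule via the stored rule number."""
    rule_id = pattern_table.get(tuple(pattern))
    if rule_id is None:
        return None
    return rule_table.get(rule_id)
```

With this layout a rule can be edited in one place without touching the pattern entries that refer to it, which is one way to read the maintenance benefit described above.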
With the method described above, a semantic pattern that fully matches the source statement may be found in the semantic pattern database, in which case the source statement is translated directly according to the translation rule corresponding to that semantic pattern. It may also happen that the database contains no semantic pattern fully matching the source statement, and only some of the category combinations in the source statement find a matching semantic pattern. In that case the matched semantic pattern is used directly as the semantic pattern of the source statement, and the categories in the source statement that have no match are handled by a conventional translation method.
Preferably, the following processing can also be applied:
First, the category combination in the source statement for which a matching semantic pattern is found is treated as an independent part. The remaining categories are then treated as all the categories of a new source statement, and the semantic pattern of the new source statement is determined again following steps D1 to D3 above. If it can be determined, the new semantic pattern and the semantic pattern of the independent part are combined as the semantic pattern of the source statement, the source statement is split into two independent parts, and each part is translated according to the translation rule of its own semantic pattern. If it cannot be determined, the semantic pattern of the independent part is used as the semantic pattern of the source statement; the independent part is translated according to the translation rule of its semantic pattern, and the remaining part is translated by a conventional translation method. If only part of the remainder can be determined, the new source statement is split again: the part whose semantic pattern can be determined becomes another independent part and is translated according to the translation rule of its semantic pattern, and the same handling of remaining categories is repeated for what is left.
Translating the source statement by combining semantic patterns in this way lets as many parts of the source statement as possible be translated according to the translation rules of their semantic patterns, which ensures translation quality. It avoids users repeatedly modifying the source statement and querying again because of inaccurate translation results, and so reduces the burden on the system.
The process is described in detail below with concrete examples.
Taking English-to-Chinese translation as an example, suppose the source statement is "leather case for mobile phone". Segmentation yields three words: "leather case", "for" and "mobile phone". After the classified dictionary is queried, the categories of the three words are determined to be "product", "for" and "product". Because the same category occurs twice, the two occurrences must be distinguished, giving "productA" and "productB".
Since the source statement has three categories, it has two category combinations: "productA for" and "productA for productB".
Looking up the two category combinations in the semantic pattern database shows that "productA for" has no corresponding semantic pattern, while "productA for productB" has a matching semantic pattern. The semantic pattern of the source statement is therefore "productA for productB".
Suppose the translation rule found for the semantic pattern "productA for productB" is: first translate "productA" into "product word A", then translate "productB" into "product word B", and finally return "product word B product word A" as the translation result of "productA for productB".
The process of translating "leather case for mobile phone" under this rule is then: first the product word "leather case" is translated into its Chinese equivalent (literally "leather sheath"), then the product word "mobile phone" is translated into its Chinese equivalent ("mobile phone"), and the two are combined in the order "product word B product word A" to give the Chinese translation result (literally "mobile phone leather sheath").
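Applying such a rule can be sketched as filling named slots and emitting them in the order the rule prescribes. The lexicon below is an assumption: the Chinese entries are plausible equivalents supplied for illustration and are not quoted from the application, and the representation of a rule as a slot order is likewise hypothetical.

```python
# Hypothetical bilingual lexicon; the Chinese entries are assumed, not taken from the source.
LEXICON = {"leather case": "皮套", "mobile phone": "手机"}

# Assumed encoding of the rule for "productA for productB": emit slot B, then slot A.
RULE_OUTPUT_ORDER = ("productB", "productA")


def translate_with_rule(words, labeled_categories, order=RULE_OUTPUT_ORDER, lexicon=LEXICON):
    """Translate each slot named in the rule and join the results in rule order."""
    slots = dict(zip(labeled_categories, words))
    return "".join(lexicon[slots[name]] for name in order)
```

The preposition "for" never appears in the output because the rule names only the two product slots; the reordering carries its meaning in the target language.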
As another example, taking Chinese-to-English translation, suppose the word categories after segmenting the source statement are "qualifier A", "qualifier B", "product word C", "qualifier D", "qualifier E" and "product word F".
The source statement then has five category combinations: "qualifier A qualifier B", "qualifier A qualifier B product word C", "qualifier A qualifier B product word C qualifier D", "qualifier A qualifier B product word C qualifier D qualifier E" and "qualifier A qualifier B product word C qualifier D qualifier E product word F".
Looking up the five category combinations in the semantic pattern database shows that only "qualifier A qualifier B product word C" has a matching semantic pattern; the other four do not. So "qualifier A qualifier B product word C" is first taken as an independent part and translated according to the translation rule of its semantic pattern to obtain translation result 1. The remaining "qualifier D qualifier E product word F" is then processed as a new source statement, giving two category combinations: "qualifier D qualifier E" and "qualifier D qualifier E product word F". The lookup shows that "qualifier D qualifier E product word F" has a matching semantic pattern, so it is translated according to the translation rule of that semantic pattern to obtain translation result 2.
If no translation rule between the two semantic patterns has been defined in advance, the two translation results are combined directly and "translation result 1 translation result 2" is the final translation result of the source statement. If a translation rule between the two semantic patterns has been predefined, for example the rule for "semantic pattern 1 semantic pattern 2" is "translation result of semantic pattern 2 for translation result of semantic pattern 1", then the final translation result is "translation result 2 for translation result 1".
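The two ways of producing the final result can be sketched with an optional template. Representing the inter-pattern rule as a Python format string is an assumption made for this example; the application does not specify how such rules are encoded.

```python
def combine(partials, template=None):
    """Join partial results in order, or fill an inter-pattern template if one is defined."""
    if template is None:
        return " ".join(partials)
    # Positional fields such as {0} and {1} refer to the partial results by index.
    return template.format(*partials)
```

With no template the partial results are simply concatenated in source order; with the template `"{1} for {0}"` the second result is placed before the first, as in the example above.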
Referring to Fig. 2, the machine translation data processing device of the application is shown, comprising a data acquisition module 10, a category determination module 20, a semantic pattern determination module 30 and a translation module 40.
The data acquisition module 10 is configured to receive a source statement to be translated and segment the source statement.
The category determination module 20 is configured to look up the words obtained by segmentation in a classified dictionary and determine the category of each word.
Semantic pattern determination module 30, for searching the semantic pattern of determining described source statement at semantic pattern database according to the classification of described each word of source statement.Preferably, semantic pattern determination module comprises classification combination determining unit, semantic pattern matching unit and relatively chooses unit.Classification combination determining unit, for determining the classification combination of source statement.Semantic pattern matching unit, for the classification obtaining being combined to the semantic pattern that the database of substitution semantic pattern is respectively searched coupling, if can find, obtains described semantic pattern.Relatively choose unit, for the categorical measure of the corresponding classification combination of the relatively more described semantic pattern that each mates, choose the classification that categorical measure is maximum and combine the semantic pattern that corresponding semantic pattern is source statement.
It will be appreciated that the comparison and selection unit further comprises a judgment subunit, configured to judge whether the category combination with the largest number of categories covers all word categories of the source statement. If so, the semantic pattern corresponding to that category combination is selected as the semantic pattern of the source statement.
If not, the judgment subunit judges whether the combination of the remaining word categories of the source statement has a corresponding semantic pattern. If it does, that semantic pattern is obtained and, together with the semantic pattern corresponding to the category combination with the largest number of categories, serves as the semantic pattern of the source statement. If it does not, the semantic pattern corresponding to the category combination with the largest number of categories alone serves as the semantic pattern of the source statement.
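The selection logic of the semantic pattern determination module can be sketched as follows. This is a minimal illustration under stated assumptions: the database contents, the category names and the function `determine_patterns` are hypothetical, and the candidate combinations are taken to be the prefixes of the category sequence (length 2 up to N), as set out in the claims.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical semantic pattern database: a sequence of word categories maps to
# the name of a semantic pattern. The entries are toy data for illustration.
PATTERN_DB: Dict[Tuple[str, ...], str] = {
    ("color", "product"): "color_product",
    ("quantity", "unit"): "quantity_unit",
}


def determine_patterns(
    categories: Sequence[str],
    db: Optional[Dict[Tuple[str, ...], str]] = None,
) -> List[str]:
    """Return the semantic pattern(s) chosen for a sequence of word categories."""
    if db is None:
        db = PATTERN_DB
    cats = tuple(categories)

    # Try every prefix of length 2..N and remember the longest one that matches.
    best_len = 0
    for length in range(2, len(cats) + 1):
        if cats[:length] in db:
            best_len = length
    if best_len == 0:
        return []  # no combination matched any semantic pattern

    patterns = [db[cats[:best_len]]]

    # If the longest match does not cover every category, check whether the
    # remaining categories, taken together, also have a semantic pattern.
    remainder = cats[best_len:]
    if remainder and remainder in db:
        patterns.append(db[remainder])
    return patterns
```

With this toy database, the category sequence `color, product, quantity, unit` yields two patterns (a partial match plus a matched remainder), while `color, product, brand` yields only the pattern of the longest matching prefix.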
The translation module 40 is configured to look up the translation rule corresponding to the semantic pattern and to translate the source statement according to that translation rule. If the source statement has a single semantic pattern, it can be translated directly according to the translation rule corresponding to that semantic pattern to obtain the translation result. If the semantic pattern of the source statement is a combination of at least two semantic patterns, the translation module comprises a translation unit and a combining unit. The translation unit is configured to translate the corresponding part of the source statement according to the translation rule of each semantic pattern to obtain partial translation results. The combining unit is configured to combine the partial translation results to obtain the final translation result of the source statement, or to adjust the partial translation results according to the translation rule between the semantic patterns to obtain the final translation result of the source statement.
Referring to Fig. 3, a terminal of the present application is shown, which comprises the aforementioned machine translation data processing device. The terminal may be a client or a server; that is, the device may be installed directly on the client used by the user, or on a server. This embodiment is described taking installation on the server as an example. The user sends the source statement to be translated to the machine translation data processing device over a network, and the device returns the translation result to the user over the network once translation is complete.
The machine translation data processing method and terminal of the present application segment the source statement, determine its final semantic pattern according to the category of each word, and then translate the source statement according to the translation rule corresponding to that semantic pattern. The semantic pattern may match the source statement completely or only partially, so even a source statement that is not a standard sentence pattern can be translated from the partial match, which helps maintain translation quality. The system only needs to maintain a small number of semantic patterns and translation rules and does not need to store a large number of example sentences, which reduces the system resources occupied by data. The smaller volume of stored data in turn improves lookup efficiency and system response speed.
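The overall flow summarized above (segment, classify, match a semantic pattern, apply its translation rule) can be sketched end to end as follows. Everything in this sketch is an assumption made for illustration: the toy dictionaries, the Chinese-to-English direction, the template form of the translation rule, and the use of forward maximum matching for segmentation, which the text does not prescribe. Only the complete-match case is shown.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical toy data, not taken from the application.
WORD_CATEGORY: Dict[str, str] = {"红色": "color", "连衣裙": "product"}
WORD_TRANSLATION: Dict[str, str] = {"红色": "red", "连衣裙": "dress"}
PATTERN_DB: Dict[Tuple[str, ...], str] = {("color", "product"): "color_product"}
TRANSLATION_RULES: Dict[str, str] = {"color_product": "{0} {1}"}


def segment(sentence: str, vocab: Iterable[str]) -> List[str]:
    """Forward maximum matching: take the longest dictionary word at each position."""
    vocab = set(vocab)
    max_len = max(map(len, vocab), default=1)
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            # Fall back to a single character when nothing in the dictionary matches.
            if piece in vocab or size == 1:
                words.append(piece)
                i += size
                break
    return words


def translate(sentence: str) -> Optional[str]:
    """Translate a sentence whose categories match one semantic pattern completely."""
    words = segment(sentence, WORD_CATEGORY)
    categories = tuple(WORD_CATEGORY.get(w, "unknown") for w in words)
    pattern = PATTERN_DB.get(categories)
    if pattern is None:
        return None  # no complete match; partial matching is not covered here
    rule = TRANSLATION_RULES[pattern]
    # The rule is a template whose slots are filled with the word translations.
    return rule.format(*(WORD_TRANSLATION[w] for w in words))
```

Note that the tables hold only words, categories, patterns and rules; no example sentences are stored, which is the storage saving the paragraph above refers to.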
Secondly, for a source statement that combines several semantic patterns, translation rules between semantic patterns can additionally be defined, so that translation accuracy can also be maintained for source statements that are not standard sentence patterns. This widens the range of application of the system and avoids the increase in system load caused by users querying repeatedly.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and for parts that are the same or similar the embodiments may be referred to one another. Because the device embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The present application is described with reference to flowcharts and/or block diagrams of the method, apparatus (device) and computer program product according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The machine translation data processing method and device provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the application, and the description of the above embodiments is only intended to help in understanding the method of the application and its core idea. At the same time, a person of ordinary skill in the art may, in accordance with the idea of the application, make changes to the specific implementations and the scope of application. In summary, the contents of this specification should not be construed as limiting the application.

Claims (9)

1. A machine translation data processing method, characterized by comprising the following steps:
Receiving a source statement to be translated and segmenting the source statement;
Looking up the words obtained by segmentation in a classified dictionary to determine the category of each word;
Searching a semantic pattern database, according to the category of each word in the source statement, to determine the semantic pattern of the source statement;
Looking up the translation rule corresponding to the semantic pattern and translating the source statement according to the translation rule.
2. The machine translation data processing method according to claim 1, characterized in that searching the semantic pattern database, according to the category of each word in the source statement, to determine the semantic pattern of the source statement comprises:
Determining the category combinations of the source statement;
Looking up each of the obtained category combinations in the semantic pattern database and, if a matching semantic pattern is found, obtaining that semantic pattern;
Comparing the number of categories in the category combination corresponding to each matched semantic pattern, and selecting the semantic pattern corresponding to the category combination with the largest number of categories as the semantic pattern of the source statement.
3. The machine translation data processing method according to claim 2, characterized in that selecting the semantic pattern corresponding to the category combination with the largest number of categories as the semantic pattern of the source statement comprises:
Judging whether the category combination with the largest number of categories covers all word categories of the source statement and, if so, selecting the semantic pattern corresponding to that category combination as the semantic pattern of the source statement;
If not, judging whether the combination of the remaining word categories of the source statement has a corresponding semantic pattern; if it does, obtaining that semantic pattern and using it, together with the semantic pattern corresponding to the category combination with the largest number of categories, as the semantic pattern of the source statement; if it does not, using the semantic pattern corresponding to the category combination with the largest number of categories as the semantic pattern of the source statement.
4. The machine translation data processing method according to claim 2 or 3, characterized in that determining the category combinations of the source statement comprises:
If the number of categories N is 2, there is one category combination;
If the number of categories N > 2, there are N-1 category combinations in total: the first two categories starting from the first category form the first category combination; the first three categories starting from the first category form the second category combination; and so on, until the N categories starting from the first category form the (N-1)th category combination.
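The enumeration in this claim amounts to listing the prefixes of the category sequence with lengths 2 through N. A minimal sketch follows; the function name `category_combinations` and the example category names are hypothetical. The behaviour for fewer than two categories (an empty list) is an assumption, since the claim does not address that case.

```python
from typing import List, Sequence, Tuple


def category_combinations(categories: Sequence[str]) -> List[Tuple[str, ...]]:
    """List the category combinations: every prefix of length 2 up to N."""
    n = len(categories)
    # For N == 2 this gives one combination; for N > 2 it gives N - 1.
    return [tuple(categories[:k]) for k in range(2, n + 1)]


combos = category_combinations(["color", "material", "product", "unit"])
# combos[0] is the first two categories, combos[-1] is all four.
```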
5. The machine translation data processing method according to claim 1, characterized in that, if the semantic pattern of the source statement is a combination of at least two semantic patterns, translating the source statement according to the translation rule comprises:
Translating the corresponding part of the source statement according to the translation rule of each semantic pattern to obtain partial translation results, and combining the partial translation results to obtain the final translation result of the source statement; or
Translating the corresponding part of the source statement according to the translation rule of each semantic pattern to obtain partial translation results, obtaining the translation rule between the semantic patterns, and adjusting the partial translation results according to that translation rule to obtain the final translation result of the source statement.
6. A machine translation data processing device, characterized by comprising:
A data acquisition module, configured to receive a source statement to be translated and to segment the source statement;
A category determination module, configured to look up the words obtained by segmentation in a classified dictionary and to determine the category of each word;
A semantic pattern determination module, configured to search a semantic pattern database, according to the category of each word in the source statement, to determine the semantic pattern of the source statement;
A translation module, configured to look up the translation rule corresponding to the semantic pattern and to translate the source statement according to the translation rule.
7. The machine translation data processing device according to claim 6, characterized in that the semantic pattern determination module comprises:
A category combination determining unit, configured to determine the category combinations of the source statement;
A semantic pattern matching unit, configured to look up each of the obtained category combinations in the semantic pattern database and, if a matching semantic pattern is found, to obtain that semantic pattern;
A comparison and selection unit, configured to compare the number of categories in the category combination corresponding to each matched semantic pattern, and to select the semantic pattern corresponding to the category combination with the largest number of categories as the semantic pattern of the source statement.
8. The machine translation data processing device according to claim 7, characterized in that the comparison and selection unit comprises:
A judgment subunit, configured to judge whether the category combination with the largest number of categories covers all word categories of the source statement and, if so, to select the semantic pattern corresponding to that category combination as the semantic pattern of the source statement;
If not, to judge whether the combination of the remaining word categories of the source statement has a corresponding semantic pattern; if it does, to obtain that semantic pattern and use it, together with the semantic pattern corresponding to the category combination with the largest number of categories, as the semantic pattern of the source statement; if it does not, to use the semantic pattern corresponding to the category combination with the largest number of categories as the semantic pattern of the source statement.
9. The machine translation data processing device according to claim 6, characterized in that, if the semantic pattern of the source statement is a combination of at least two semantic patterns, the translation module comprises:
A translation unit, configured to translate the corresponding part of the source statement according to the translation rule of each semantic pattern to obtain partial translation results;
A combining unit, configured to combine the partial translation results to obtain the final translation result of the source statement, or to adjust the partial translation results according to the translation rule between the semantic patterns to obtain the final translation result of the source statement.
CN201210285384.7A 2012-08-10 2012-08-10 Computer translation data processing method and computer translation data processing device Pending CN103577397A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210285384.7A CN103577397A (en) 2012-08-10 2012-08-10 Computer translation data processing method and computer translation data processing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210285384.7A CN103577397A (en) 2012-08-10 2012-08-10 Computer translation data processing method and computer translation data processing device

Publications (1)

Publication Number Publication Date
CN103577397A (en) 2014-02-12

Family

ID=50049206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210285384.7A Pending CN103577397A (en) 2012-08-10 2012-08-10 Computer translation data processing method and computer translation data processing device

Country Status (1)

Country Link
CN (1) CN103577397A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001082111A2 (en) * 2000-04-24 2001-11-01 Microsoft Corporation Computer-aided reading system and method with cross-language reading wizard
US20070100601A1 (en) * 2005-10-27 2007-05-03 Kabushiki Kaisha Toshiba Apparatus, method and computer program product for optimum translation based on semantic relation between words
CN101877189A (en) * 2010-05-31 2010-11-03 张红光 Machine translation method from Chinese text to sign language
CN102117270A (en) * 2011-03-29 2011-07-06 中国科学院自动化研究所 Statistical machine translation method based on fuzzy tree-to-accurate tree rule
CN102135957A (en) * 2010-01-22 2011-07-27 阿里巴巴集团控股有限公司 Clause translating method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109426667A (en) * 2017-08-21 2019-03-05 阿里巴巴集团控股有限公司 The interpretation method and device of air ticket classification CAT rule
CN108664545A (en) * 2018-03-26 2018-10-16 商洛学院 A kind of translation science commonly uses data processing method
CN109684448A (en) * 2018-12-17 2019-04-26 北京北大软件工程股份有限公司 A kind of intelligent answer method
CN109684448B (en) * 2018-12-17 2021-01-12 北京北大软件工程股份有限公司 Intelligent question and answer method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20140212)