CN1120439C - Chinese forming device for machine translation - Google Patents


Info

Publication number
CN1120439C
Authority
CN
China
Prior art keywords
sentence
chinese
key element
verb
statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 96112514
Other languages
Chinese (zh)
Other versions
CN1156287A (en
Inventor
郭俊桔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd
Publication of CN1156287A
Application granted
Publication of CN1120439C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Machine Translation (AREA)

Abstract

The object is to determine the fundamental sentence patterns of Chinese and to reduce the number of syntactic rules needed to generate Chinese sentences. To this end, the device provides a fundamental sentence pattern part, in which Chinese fundamental sentence pattern structures are registered, and a syntactic element sequence part, in which slot positions and the ordering restrictions on the syntactic elements that occupy them are registered. The Chinese dependency structure corresponding to a Japanese sentence is input from an input part 100 and sent to a preprocessing part 200 for processing. When the case mark indicates a demonstrative case, the subject in the dependency structure is not omitted, and the structure is sent to a fundamental element expansion part 300 for processing. A free element expansion part 400 then finds the words in the dependency structure that are free elements. If there are two candidates for the case mark of such a node, the case mark of the free element, the Japanese surface symbol, the semantic symbol, and the semantic dominant code are used as the retrieval key to consult a sentence element information part 450; probable candidates are detected by matching, and the optimum candidate is selected. Finally, a syntactic element sequence part 650 is consulted to adjust the order in which the elements are arranged.

Description

Chinese generating apparatus for machine translation
Technical field
The present invention relates to machine translation, and in particular to a Chinese generating apparatus that uses syntactic and semantic information to carry out machine translation.
Background art
(First, the terminology of Japanese-to-Chinese (and Chinese-to-Japanese) machine translation, the syntax of Chinese, and related matters are defined and described.)
The present application concerns a Chinese generating apparatus for machine translation. At present, however, Chinese is not well known in Japan, and machine translation is a specialized technical field. Therefore, before the prior art and the embodiments of the present application are described, it is necessary to introduce, to the minimum extent required and including indirectly related fields, the related documents on Japanese-to-Chinese (and Chinese-to-Japanese) machine translation and on the gist of the invention, and to explain the meanings or definitions of the terms used. (Strictly speaking, therefore, this section is not a description of pure "prior art"; it also touches on the gist of the present invention.)
(Related documents)
1. On Chinese
Phase Pu slow-witted work " the ABC of Chinese " NHK proceedings etc.
2. On machine translation
Herd wild force and then show the publication of " mechanical translation introduction " ohm company
Long-tail Allah compiles " mechanical translation " ohm company and publishes
3. On Japanese-to-Chinese (and Chinese-to-Japanese) machine translation
(1) Japanese Unexamined Patent Publication (Kokai) No. H3-102568, "Machine translation from Japanese to Chinese"
(2) Japanese Unexamined Patent Publication (Kokai) No. H3-20295, "Machine translation apparatus"; see also Japanese Unexamined Patent Publication (Kokai) No. S61-077639, etc.
4. Meanings or definitions of terms
Tree structures (in syntactic analysis, phonetic structure, and so on) and the various nodes:
In a tree structure, each unit is a node. When other nodes are connected beneath a node, the upper node is called the parent node and the lower nodes are called child nodes.
The various "nodes" used in the present invention are described below.
A verb node is a node that has the verb attribute.
An adjective node is a node that has the adjective attribute.
A leaf node is a node that has no child nodes.
A subject attribute zero node is a zero node that has the subject attribute.
A zero node is a node added for convenience of processing.
In addition, if the case attribute of the node being processed is the subject attribute, the node is called a subject attribute node.
Next, a tree structure is processed, according to its nature, "from top to bottom" and "from left to right".
For example, the tree structure shown in Figure 16 is processed in the order d, e, b, f, c, a.
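The order d, e, b, f, c, a is what a post-order walk (children left to right, then the parent) yields for a small tree. Figure 16 is not reproduced here, so the tree shape below is an assumption chosen to be consistent with that order, and the names `Node` and `processing_order` are hypothetical. A minimal sketch:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def processing_order(node: Node) -> List[str]:
    """Visit the children from left to right, then the node itself (post-order)."""
    order: List[str] = []
    for child in node.children:
        order.extend(processing_order(child))
    order.append(node.label)
    return order


# Assumed shape (not taken from Figure 16): a has children b and c,
# b has children d and e, and c has the single child f.
tree = Node("a", [Node("b", [Node("d"), Node("e")]), Node("c", [Node("f")])])

print(processing_order(tree))  # ['d', 'e', 'b', 'f', 'c', 'a']
```

Other tree shapes could produce the same order, so the sketch only shows that the stated sequence is compatible with a children-first, left-to-right walk.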
In addition, the present invention focuses on the important role that verbs and adjectives play in Chinese grammar and syntactic analysis, and tree structures are used for that analysis.
Dependency structure:
This refers to the relation, among the elements (morphemes) that make up a sentence, between an element (argument) and its modifiers (modifier). In a diagram of a dependency structure, modifiers are usually placed beneath the element they modify, and the relation between the two is expressed by the modifier's case marker, such as nominative, objective, or locative. For example, the sentence structure of "he catches a cold" is shown in Figure 17(a), and its dependency structure in Figure 17(b).
As the figure shows, a dependency structure represents only the necessary nodes and is therefore simple. When a dependency structure is the input to intermediate conversion, few rules are needed for the processing, and the rules are easy to edit. For this reason, the input and output structures of conversion-type machine translation systems are generally dependency structures. (See the prior art described later and Figure 11.)
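One way to picture a dependency structure is as a head element with a list of modifiers, each carrying a case marker. The sketch below encodes "he catches a cold" that way. Figure 17 is not reproduced here, so the field names and the labels (`DepNode`, `case`, `"nominative"`) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DepNode:
    word: str
    pos: str                      # part of speech
    case: Optional[str] = None    # case marker linking this node to its head
    modifiers: List["DepNode"] = field(default_factory=list)


def describe(node: DepNode, depth: int = 0) -> List[str]:
    """Render the head first, then its modifiers indented beneath it."""
    label = f"{node.word} [{node.pos}]"
    if node.case:
        label += f" <{node.case}>"
    lines = ["  " * depth + label]
    for modifier in node.modifiers:
        lines.extend(describe(modifier, depth + 1))
    return lines


# Hypothetical encoding of "he catches a cold": the verb is the head and
# "he" hangs beneath it with a nominative case marker.
sentence = DepNode("catch a cold", "verb",
                   modifiers=[DepNode("he", "noun", case="nominative")])

print("\n".join(describe(sentence)))
```

Only the nodes that carry content appear, which is the simplicity the passage attributes to dependency structures.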
Surface symbols, in particular the source-language surface symbol, the sentence-head surface symbol, and the sentence-tail surface symbol handled in the present invention: Japanese particles are function words. That is, the same particle can serve different functions in different sentences. For example, "で" marks the instrumental case in "手でつかむ" (grasp by hand) and the locative case in "大阪で会う" (meet in Osaka). "で" itself is therefore called a surface symbol, while the meaning that "で" carries in a given sentence is called the deep symbol. In other words, one surface symbol can have several meanings, so when a natural language such as Japanese is processed, the actual meaning of surface symbols such as "で" must be determined.
"Sentence-head surface symbol" refers in the present invention to a Chinese preposition. A preposition is normally placed at the head of a prepositional phrase, hence the name.
"Sentence-tail surface symbol": in Chinese, when a noun is in the locative case, the particular location is expressed by words meaning "inside", "on", "under", "left", "right", and so on. These words are normally placed at the tail of the prepositional phrase, so the present invention calls them sentence-tail surface symbols.
Slots, and the slot positions of free elements in Chinese generation:
A slot is, literally, a hole. Analysis of Chinese sentences (sentence patterns) shows that their elements can be divided into fundamental elements and free elements. Once the positions of fundamental elements such as the subject and verb have been fixed, gaps called slots can be set between the fundamental elements, and these slots are where free elements are placed. Taking the SVO pattern, made up of a subject, verb, and object, as an example, there are four slot positions, (1) to (4), as shown below.
(1)S(2)V(3)O(4)
The present invention also focuses on the rules by which these slots are formed.
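The slot layout (1)S(2)V(3)O(4) can be read as a simple interleaving rule: emit whatever sits in a slot, then the fundamental element that follows it. The sketch below assumes that slots are numbered as in the pattern above and that each slot holds an already ordered list of free elements; the function name `linearize` and the English glosses are hypothetical.

```python
from typing import Dict, List, Optional


def linearize(subject: str, verb: str, obj: Optional[str],
              free: Dict[int, List[str]]) -> List[str]:
    """Interleave S, V, O with the contents of slots 1 to 4."""
    words: List[str] = []
    for slot, fundamental in ((1, subject), (2, verb), (3, obj), (4, None)):
        words.extend(free.get(slot, []))   # free elements placed in this slot
        if fundamental is not None:
            words.append(fundamental)      # fundamental element after the slot
    return words


# Hypothetical example: a time word and a place word both placed in slot 2.
result = linearize("he", "ate", "a meal", {2: ["yesterday", "at school"]})
print(result)  # ['he', 'yesterday', 'at school', 'ate', 'a meal']
```

Because the fundamental elements never move, only the slot assignment and the order inside a slot have to be decided for each free element.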
Conjunctive modifiers and conjunctions:
These correspond to Japanese particles. They differ from the connecting words used when several words are joined into another word, as in Japanese expressions such as "Japanese fine れ" and "between rattan". See the above-mentioned Japanese Unexamined Patent Publication No. H3-102568 or No. S61-077639.
Morpheme:
The smallest meaningful unit, for example a word such as "he" or "teacher" in Chinese.
Adverb of time:
An adverb or noun with a temporal meaning, for example "today" or "yesterday".
Chinese auxiliary verbs:
For example, words meaning "can", "be able to", and "will". Their grammatical function is the same as that of English auxiliary verbs. In the present invention, the Japanese auxiliary verb in a sentence (sentence pattern) and its meaning are used as the search key to look up the corresponding Chinese auxiliary verb in a "Japanese-Chinese auxiliary verb correspondence table". For example, the meaning of the Japanese auxiliary verb "い" is "hope", so the corresponding Chinese auxiliary verb is "想" (want). The position of "想" in the sentence pattern is slot 2.
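The auxiliary verb lookup described above amounts to a keyed table: the Japanese auxiliary verb and its meaning go in, and the Chinese auxiliary verb and its slot come out. The sketch below is a minimal illustration with a single entry. The Japanese key is copied as printed in the text, the Chinese form "想" is an assumed reading of the word the text renders as "think", and the names `AuxEntry`, `AUX_TABLE`, and `lookup_aux` are hypothetical.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class AuxEntry(NamedTuple):
    chinese: str   # Chinese auxiliary verb
    slot: int      # slot position in the sentence pattern


# Hypothetical one-entry correspondence table keyed by
# (Japanese auxiliary verb as printed in the text, meaning).
AUX_TABLE: Dict[Tuple[str, str], AuxEntry] = {
    ("い", "hope"): AuxEntry("想", 2),
}


def lookup_aux(japanese_aux: str, meaning: str) -> Optional[AuxEntry]:
    """Return the Chinese auxiliary verb and its slot, or None if unregistered."""
    return AUX_TABLE.get((japanese_aux, meaning))


print(lookup_aux("い", "hope"))
```

Keying on the meaning as well as the surface form lets one Japanese auxiliary verb map to different Chinese words when it carries different senses.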
Conjunctions:
In Chinese these are words meaning "and", "or", "as well as", and so on, which correspond to Japanese "と", "か", "と", and the like, respectively.
Verbs and adjectives:
Details such as conjugation differ, but their meaning in Japanese and Chinese is roughly the same. The same applies to demonstratives, numerals, and the like.
Topic (TOPIC):
In Chinese, a word or phrase that is to be specially emphasized is placed at the beginning of the sentence. This word or phrase is called the "topic".
Attribute values of special sentence patterns in Chinese:
Chinese has many special sentence patterns, which are determined by the elements that make up the sentence; examples are the "把" (ba) sentence, the "被" (bei) sentence, and the "making" (causative) sentence. Taking the "把" sentence as an illustration: in Chinese, when a transitive verb has two objects (a direct object and an indirect object), a "把" sentence arises naturally. For example, "the book is placed in the car" (a "把" sentence) is more natural than "put the book in the car". (In the embodiments described later, which of the above sentence patterns is to be generated is stored in the verb attribute SENATTR.)
"被" sentences and "把" sentences in Chinese:
In Chinese, a "被" sentence is normally a passive sentence, corresponding to the "passive sentence" or the "adversative passive sentence" of Japanese. A Chinese "把" sentence usually uses a ditransitive verb and has both a direct object and an indirect object. In general, "把" is placed before the direct object, and the direct object is moved in front of the verb.
The present invention focuses on the rules of the above special sentence patterns.
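The rule stated for the fronted-object pattern (place the marker before the direct object and move the object in front of the verb) is a small reordering. The sketch below assumes the marker is "把", which is a reading of the pattern name rather than something the text spells out, and uses a hypothetical Chinese example; the function names are also hypothetical.

```python
from typing import List


def plain_sentence(subject: str, verb: str, direct_object: str,
                   rest: List[str]) -> List[str]:
    """Ordinary order: subject, verb, direct object, remaining elements."""
    return [subject, verb, direct_object, *rest]


def ba_sentence(subject: str, verb: str, direct_object: str,
                rest: List[str]) -> List[str]:
    """Put the marker before the direct object and move both before the verb."""
    return [subject, "把", direct_object, verb, *rest]


# Hypothetical example: "he", "put", "book", "in the car".
print("".join(plain_sentence("他", "放", "书", ["在车里"])))  # 他放书在车里
print("".join(ba_sentence("他", "放", "书", ["在车里"])))     # 他把书放在车里
```

In the apparatus, whether this reordering applies would be decided from a verb attribute such as SENATTR; the sketch leaves that decision to the caller.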
Ditransitive verb:
A Chinese ditransitive verb is the same as a ditransitive verb in English (a transitive verb that can take two objects), that is, a verb like "gave" in "He gave me a car" (SVOO).
Causative sentence:
A sentence that expresses having another person carry out an action. In Chinese, a causative sentence can be generated with the word "给" (give).
For example: the teacher "给" him have a meal. (The teacher lets him have a meal.)
Passive sentence:
That is, the "passive sentence" of Japanese. In Chinese it is usually expressed with the word "被".
The object in Chinese:
It corresponds to the "目的語" (object) of Japanese.
(The prior art proper)
In other words, what follows describes the prior art in the strict sense.
Today, as technology advances rapidly, one must keep absorbing knowledge and information in order to keep pace with the times. With transport and communications highly developed, the knowledge to be absorbed is no longer confined to one's own country; a great deal of it comes from abroad. Under these circumstances, most people are not proficient in foreign languages, and the foreign languages that serve as information sources are numerous, so translation becomes very important. To make translation more efficient, improving both its quality and its speed, manual work must be handed over to machines; in other words, the era in which machine translation systems must be considered has arrived. Depending on the characteristics of the translation, machine translation methods can be divided into the direct method, the intermediate conversion method, the pivot language method (PIVOT), and so on. Of these, as already explained, the intermediate conversion method is most often adopted because it achieves the goal with fewer conversion rules.
A machine translation apparatus that adopts the intermediate conversion method is, as shown in Figure 13, roughly made up of the following four parts:
(1) a source language (the language that is input and translated into another language) analysis unit
(2) an intermediate structure conversion unit
(3) a target language (the language into which the input is translated and output) generation unit
(4) reference dictionaries
In general, the data structure of the source language should be simplified: all the information to be handled is converted into a simple intermediate structure, for example a dependency structure, from which the target language is then obtained.
However, this is well-known technology, disclosed for example by the present applicant in the above-mentioned Japanese Unexamined Patent Publication No. H3-202954, and is therefore not described further.
Under these circumstances, the quality of machine translation is, as partly explained already, determined by the following factors: how correctly the source language analysis unit analyzes the input sentence; how well the intermediate structure conversion unit removes the differences between the source language (Japanese in this specification) and the target language (Chinese in this specification), for example by removing differences in syntax and semantics or by correctly selecting translation equivalents; and how correctly the target language generation unit generates the target language in accordance with its generation grammar rules.
In Chinese in particular, the position of a word in the sentence is crucial, and a translation apparatus has difficulty determining it; this characteristic strongly affects how well the above processing works.
That is, if a word occupies a different position in the sentence, the meaning of the sentence can be entirely different. For example, depending on where the phrase "on the desk" appears, "he *on the desk* jumps" (he is jumping on the desk) and "he jumps *on the desk*" (he has jumped onto the desk) have entirely different meanings.
In addition, certain words (in Chinese, in principle Chinese characters) must be arranged in a fixed order; otherwise the sentence is ungrammatical. For example, as shown below, a time word (a word expressing time; "place word", "instrument word", and the like are used below in the analogous sense, though grammatical terms such as "subject" are of course not restricted in this way) must be placed before the place word that expresses a location.
Correct Chinese sentence: he *yesterday* *at school* has a meal. (He ate at school yesterday.)
Incorrect Chinese sentence: he *at school* *yesterday* has a meal.
Conversely, the order of certain specific words is sometimes quite free. For example, as shown below, a time word may be placed either before or after the subject.
Time word before the subject: yesterday, he went to school. (He went to school yesterday.)
Time word after the subject: he yesterday went to school.
Instrument word + place word: he *by the strength of the company* *in society* fights. (He struggles in society by relying on the strength of the company.)
Place word + instrument word: he *in society* *by the strength of the company* fights.
As the above examples show, for machine translation whose target language is Chinese, deciding the arrangement and order of the words in a sentence is an extremely important problem.
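The two kinds of behavior in the examples above (a fixed order for time and place words, a free order for instrument and place words) can be expressed as a set of precedence pairs that a candidate arrangement is checked against. The sketch below is a minimal illustration; the category labels, the English glosses, and the names `MUST_PRECEDE` and `violations` are assumptions, not the patent's own data.

```python
from typing import List, Tuple

# Hypothetical precedence constraints: (earlier kind, later kind).
# Only the time-before-place rule from the examples is registered;
# instrument and place words are left unconstrained.
MUST_PRECEDE = {("time", "place")}


def violations(elements: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (word that should come first, word found before it) pairs."""
    found: List[Tuple[str, str]] = []
    for i, (word_a, kind_a) in enumerate(elements):
        for word_b, kind_b in elements[i + 1:]:
            # word_b appears later, but its kind is required to come earlier.
            if (kind_b, kind_a) in MUST_PRECEDE:
                found.append((word_b, word_a))
    return found


good = [("yesterday", "time"), ("at school", "place")]
bad = [("at school", "place"), ("yesterday", "time")]
free = [("in society", "place"), ("by the company's strength", "instrument")]

print(violations(good))  # []
print(violations(bad))   # [('yesterday', 'at school')]
print(violations(free))  # []
```

An arrangement with no violations is acceptable; pairs of kinds that are not registered remain freely orderable, matching the instrument and place example.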
An example of an existing translation apparatus that generates Chinese is the device disclosed in Japanese Unexamined Patent Publication No. H3-102568, whose configuration is shown in Figure 14. In the figure:
10 is a Japanese input device. When Japanese is input from a Japanese word processor as phonetic symbols in hiragana, katakana, or Roman letters, it consults the Japanese character font file 11, the Japanese vocabulary file 18, and the alphabet/numeral/symbol font file 15, described below, and converts the input phonetic symbols into a sentence of mixed kana and kanji.
11 is the Japanese character font file, in which character codes serving as keys and the Japanese characters (kanji and kana) corresponding to those codes are registered in advance.
15 is the alphabet/numeral/symbol font file, in which character codes serving as search keys and the corresponding letters, numerals, and symbols are registered in advance.
18 is the Japanese vocabulary file, in which kanji character codes and kana character codes serving as search keys, and the corresponding Japanese characters and words, are registered in advance.
20 is a translation selection learning device, which divides the input Japanese text into syllables and then removes the conjunctive modifiers.
35 is the Japanese dictionary file, in which the parts of speech of Japanese words and phrases are registered in advance.
30 is a Japanese part-of-speech adding device, which, sentence by sentence, consults the Japanese dictionary file and adds parts of speech to the Japanese sentence from which the conjunctive modifiers have been deleted.
40 is a part-of-speech order conversion device, which rearranges the order of the parts of speech in the tagged Japanese sentence according to the Chinese grammar stored in the device.
55 is the Chinese language knowledge database file, in which Japanese character codes, Japanese characters, and words serving as search keys, and the corresponding Chinese character codes, phonetic symbol codes, Chinese characters, and words, are registered in advance.
50 is a Japanese-to-Chinese translation device, which translates the input Japanese sentence into Chinese by consulting the Chinese language knowledge database 55.
60 is a Chinese grammar checking device, which parses each translated Chinese sentence sent from the Japanese-to-Chinese translation device according to the Chinese syntactic rules shown in Figure 15, and adds Chinese grammatical functions, such as subject and adverbial, to sentences that conform to the syntactic rules. Chinese sentences that cannot be analyzed are first stored as idioms.
70 is a Chinese grammar conversion device, which applies grammar conversion to the parsed Chinese sentence and then outputs it as a Chinese idiom.
85 is the Chinese character font file, in which character codes and phonetic symbols serving as search keys, and the corresponding Chinese characters and words, are registered in advance.
80 is a Chinese output device, which reuses the Japanese word processor and outputs Chinese by consulting the Chinese character font file 85 and the alphabet/numeral/symbol font file 15.
The operation of this prior example is described below.
After "わ は や ん は い か ら ぺ き ん ま で ひ こ う き に つ" has been input in hiragana from the Japanese word processor, the Japanese input device 10 uses the Japanese font file 11, the Japanese vocabulary file 18, and the alphabet/numeral/symbol font file 15 to convert it into Japanese text of the following form.
私は上海から北京まで飛行機に?つ。
Processing then passes to the translation selection learning device 20, which divides the above Japanese text into syllables and deletes the conjunctive modifiers "は" and "に". The result is:
私 上海 から 北京 まで 飛行機 ?つ。
Next, the Japanese part-of-speech adding device uses the Japanese dictionary file 35 to add a part of speech to each word. The result is:
私 上海 から 北京 まで 飛行機 ?つ。
Next, the part-of-speech order conversion device consults the grammar rules it stores and rearranges the parts of speech to match their order in Chinese grammar. For example, where Japanese has "noun + particle", the order is changed to "particle + noun" for the corresponding Chinese. The result is shown below.
noun, case particle, noun, adverbial particle, noun, noun, verb
私 から 上海 まで 北京 飛行機 ?つ。
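The rearrangement step of the prior example, turning "noun + particle" into "particle + noun", can be sketched as a single pass over part-of-speech-tagged tokens. The token list below is a reconstruction of the example sentence after the conjunctive modifiers have been removed; the verb is not legible in the text, so a placeholder stands in for it. The tag names and the function `reorder` are assumptions.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part of speech)

PARTICLES = {"case particle", "adverbial particle"}


def reorder(tokens: List[Token]) -> List[Token]:
    """Swap every adjacent 'noun + particle' pair into 'particle + noun'."""
    out: List[Token] = []
    i = 0
    while i < len(tokens):
        is_pair = (i + 1 < len(tokens)
                   and tokens[i][1] == "noun"
                   and tokens[i + 1][1] in PARTICLES)
        if is_pair:
            out.extend([tokens[i + 1], tokens[i]])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


# Reconstructed example; "VERB" is a placeholder for the illegible verb.
tagged = [("私", "noun"), ("上海", "noun"), ("から", "case particle"),
          ("北京", "noun"), ("まで", "adverbial particle"),
          ("飛行機", "noun"), ("VERB", "verb")]

print([pos for _, pos in reorder(tagged)])
print(" ".join(word for word, _ in reorder(tagged)))
```

The resulting tag sequence is noun, case particle, noun, adverbial particle, noun, noun, verb, which matches the sequence listed for this step.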
Then the Japanese-to-Chinese translation device 50 uses each Japanese morpheme as a search key, looks up its Chinese translation in the Chinese language knowledge database 55, and replaces the Japanese morpheme with it. The result has the following form:
I from Shanghai to Beijing aircraft took
After this, the Chinese grammar checking device 60 follows the Chinese syntactic rules shown in Figure 15 and adds the name of a grammatical function (such as subject or adverbial) to each Chinese morpheme in the input sentence. The result is as follows: (grammatical function names) subject, place modifier, object, verb
I | from Shanghai to Beijing | aircraft | took
Then the Chinese grammar conversion device 70 performs conversion by consulting the stored grammatical differences between Japanese and Chinese.
For example, "object (as already noted, the Japanese 目的語) + verb" in Japanese must be adjusted to the form "verb + object" in Chinese. After conversion according to this rule, the result is: (grammatical function names) subject, place modifier, verb, object
I | from Shanghai to Beijing | took | aircraft
Finally, the Chinese output device 80 performs its processing. To display Chinese on a Japanese word processor, the Chinese character codes are used as search keys to consult the Chinese character font file 85 and output the corresponding Chinese. Output in Chinese, the above character codes become:
"I from Shanghai to Beijing took an aircraft"
However, the prior-art device described above has the following problems.
1. A device of this kind decides the position of each morpheme in the Chinese sentence according to the syntactic rules of Chinese grammar. Consequently, if a syntactic rule has not been defined in advance, the Chinese sentence (sentence pattern) corresponding to that rule cannot be generated, and the positions of the morphemes cannot be determined. For example, under the general syntactic rules shown in Figure 15, the adverbial (place modifier) is placed between the subject and the verb, so for the example sentence above a high-quality Chinese sentence such as "I took an aircraft from Shanghai to Beijing", with the verb and object before the place modifier, cannot be generated. To avoid this, a set of syntactic rules that can be called complete would have to be collected, which requires enormous development cost and labor.
2. When a Chinese sentence is generated, only syntactic information is used. Consequently, when several syntactic rules apply, it is difficult to select the most appropriate translation, and thus difficult to generate a high-quality Chinese sentence. For example, depending on the semantic class of the verb, an adverbial is sometimes placed before the verb (for verbs such as "read" and "teach") and sometimes after it (for verbs such as "put" and "store"). Therefore, without data classified by the meaning of the verb, a correct Chinese sentence cannot be generated. Concretely, the cases are as follows.
* The book is put "in the car". (The book is placed in the car.)
(Incorrect: the book "in the car" is put.)
* He "in the car" reads. (He reads in the car.)
(Incorrect: he reads "in the car".)
3. As the number of Chinese syntactic rules grows, generating a Chinese sentence takes longer, so the efficiency of the translation system falls.
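Problem 2 above turns on information that a purely syntactic rule set lacks: whether a place adverbial goes before or after the verb depends on the verb's semantic class. The sketch below shows how a per-verb table could carry that information. The table contents follow the four verbs named in the text, but the English glosses, the table name `PLACE_POSITION`, and the function `place_modifier` are assumptions.

```python
from typing import Dict, List

# Hypothetical per-verb data: where a place adverbial sits relative to the verb.
PLACE_POSITION: Dict[str, str] = {
    "read": "before",
    "teach": "before",
    "put": "after",
    "store": "after",
}


def place_modifier(subject: str, verb: str, place: str) -> List[str]:
    """Order subject, verb, and place adverbial according to the verb's class."""
    position = PLACE_POSITION.get(verb)
    if position == "before":
        return [subject, place, verb]
    if position == "after":
        return [subject, verb, place]
    raise KeyError(f"no placement data registered for verb {verb!r}")


print(place_modifier("he", "read", "in the car"))       # ['he', 'in the car', 'read']
print(place_modifier("the book", "put", "in the car"))  # ['the book', 'put', 'in the car']
```

A generator that consults only one fixed syntactic rule would produce one of these two orders for every verb and so get one of the examples wrong.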
Summary of the invention
It is therefore desirable to realize a Chinese generating apparatus for translation that is free of the above problems. The object of the present invention is to solve them.
To solve the above problems, a first aspect of the present invention provides a Chinese generating apparatus for machine translation that converts into a Chinese sentence the Chinese dependency structure obtained by language analysis and intermediate conversion of a sentence in a source language such as Japanese. The apparatus is characterized by comprising the following parts:
a basic sentence pattern storage device, in which each subcategory code of Chinese verbs and the Chinese basic sentence pattern structure corresponding to that code are registered in advance (stored in a form that can be used and referred to);
a sentence element information storage device, in which the case marker of a modifier, the surface symbol of the source language, the semantic dominant code, and the semantic code are registered in advance together with the corresponding element information, such as the Chinese sentence-head surface symbol, the sentence-tail surface symbol, and the slot position;
a syntactic element order storage device, in which the slot positions where free elements can be placed and the corresponding ordering restrictions on syntactic elements are registered in advance;
a sentence structure generating device, which, for the input Chinese dependency structure, detects verbs and adjectives whose subject has been omitted, adds a subject attribute zero node to the structure of the sentence, and then, according to the verb subcategory code (which also applies to adjectives) of the verb or adjective that is an element of the dependency structure, consults the basic sentence pattern storage device, takes out the corresponding basic sentence pattern, and thereby generates the Chinese sentence structure;
a free element generating device, which, for each sentence element of the Chinese dependency structure other than the fundamental elements, takes out the corresponding Chinese sentence-head surface symbol, sentence-tail surface symbol, and slot position from the sentence element information storage device according to the case marker, the source-language surface symbol, the semantic dominant code, and so on, generates the free element at the corresponding position in the sentence structure by reference to the slot position taken out, and then, by reference to the special sentence pattern attribute value of each verb and adjective, generates a special sentence pattern for each sentence structure; and
a sentence generating device, which takes out in order from the syntactic element order storage device the ordering restrictions on the syntactic elements of each slot, checks by a prescribed procedure whether the arrangement order of the slot elements of the sentence structure is appropriate, adjusts it if it is not, and then linearizes the sentence structure into a sentence sequence to generate the Chinese sentence.
According to a second aspect of the present invention, the sentence structure generating device has a preprocessing unit and a fundamental element expansion unit. The preprocessing unit detects, in the input Chinese dependency structure, verbs and adjectives whose subject has been omitted and adds a subject attribute zero node to the sentence structure. The fundamental element expansion unit consults the basic sentence pattern storage device according to the verb subcategory code of the element in the dependency structure, takes out the corresponding basic sentence pattern, and generates the Chinese basic sentence pattern structure.
The free element generating device has a free element expansion unit and a special sentence pattern generation unit. For each element of the dependency structure other than the fundamental elements, the free element expansion unit extracts the corresponding Chinese sentence-head surface symbol, sentence-tail surface symbol, and slot position from the sentence element information storage device according to the case marker, the source-language surface symbol, the semantic dominant code, the semantic code, and so on, and generates the free element at the corresponding position of the sentence structure by reference to the extracted slot position. For the sentence structure generated by the free element expansion unit, the special sentence pattern generation unit generates a special sentence pattern by reference to the special sentence pattern attribute value of each verb and adjective.
The sentence generating device has an element position adjustment unit and a post-processing unit. The element position adjustment unit takes out the ordering restrictions on each syntactic element in turn from the syntactic element order storage device, and checks and adjusts the arrangement order of the slot elements of the sentence structure. The post-processing unit linearizes the sentence structure and obtains a fluent Chinese sentence.
According to the above structure, in the first aspect of the present invention, the builder of the device, referring to dictionaries and the like, registers in advance in the basic sentence pattern storage device each subclassification code of the Chinese verbs together with the corresponding Chinese basic sentence pattern structure. The case mark of each modifier, the source-language surface symbol, the semantic dominant code, and the semantic code, together with the corresponding Chinese sentence-initial surface symbol, sentence-final surface symbol, slot position, and so on, are registered in advance in the sentence element information storage device. Likewise, the order restrictions on the syntactic elements for each slot position are registered in advance in the syntactic element sequence storage device. The sentence structure generating device, for the input Chinese dependency structure, refers to a built-in dictionary or the like, detects verbs and adjectives whose subject has been omitted, adds a zero node with the subject attribute to the corresponding sentence structure, and then, according to the verb subclassification code of the central element in the dependency structure, refers to the aforesaid basic sentence pattern storage device, takes out the corresponding basic sentence pattern, and generates the Chinese sentence structure. The free element generating device, for each sentence element of the above-mentioned dependency structure other than the fundamental elements, takes the corresponding Chinese sentence-initial surface symbol, sentence-final surface symbol, and slot position out of the aforesaid sentence element information storage device according to the case mark, source-language surface symbol, semantic dominant code, semantic code, and so on. With reference to the slot position taken out, it generates the free element at the corresponding position in the sentence structure, and then, with reference to the special-sentence-pattern attribute value of each verb and adjective, generates a special sentence pattern for each sentence structure. The sentence generating device takes the syntactic element order restrictions for each slot in turn out of the aforesaid syntactic element sequence storage device, checks the arrangement order of the slot elements of the above-mentioned sentence structure, rearranges them if necessary, and then linearizes the sentence structure, that is, arranges the words in a line in the form in which people read them, thereby obtaining a Chinese sentence.
In the second aspect of the present invention, the above-mentioned sentence structure generating device has a preprocessing unit and a fundamental element expansion unit. The preprocessing unit detects, in the input Chinese dependency structure, verbs and adjectives whose subject has been omitted, and adds a zero node with the subject attribute to the sentence structure. The fundamental element expansion unit, according to the verb subclassification code of the central element (verb or adjective) in the dependency structure, refers to the aforesaid basic sentence pattern storage device, takes out the corresponding basic sentence pattern, and generates the basic sentence structure of the Chinese. The aforesaid free element generating device has a free element expansion unit and a special sentence pattern generation unit. The free element expansion unit, for each element of the above-mentioned Chinese dependency structure other than the fundamental elements, takes the corresponding Chinese sentence-initial surface symbol, sentence-final surface symbol, and slot position out of the above-mentioned sentence element information storage device according to the case mark, source-language surface symbol, semantic dominant code, semantic code, and so on, and then, with reference to the slot position taken out, generates the free element at the corresponding position in the sentence structure. The special sentence pattern generation unit, for each sentence structure generated by the aforesaid free element expansion unit, refers to the special-sentence-pattern attribute value of each verb and adjective and generates a special sentence pattern. The aforesaid sentence generating device consists of an element position adjustment unit and a post-processing unit. The element position adjustment unit takes the syntactic element order restrictions in turn out of the syntactic element sequence storage device, and checks and adjusts the arrangement order of the elements in each slot of the above-mentioned sentence structure. The post-processing unit linearizes the sentence structure and obtains the final Chinese sentence in a form suitable for people to read.
The present invention is described below on the basis of an embodiment.
When Chinese is generated in machine translation, attention is first paid to the following characteristics of Chinese in order to minimize the number of syntactic rules required.
(1) As noted above, the verb determines or influences the sentence pattern (and therefore the sentence structure), so the basic sentence patterns containing a verb are decided from the subclassification of that verb. On this point, reference may be made, for example, to the verb classification codes (for example T1, D1) of a dictionary such as the Longman Dictionary of Contemporary English (Longman Group Limited, 1978), or to the paper by K. J. Chen et al., "A Classification of Chinese Verbs for Language Parsing", Proceedings of International Conference of Chinese and Oriental Language, pp. 414-417 (Toronto), 1988.
The verb subclassification codes used in this embodiment, the corresponding basic sentence patterns, and their example sentences are shown in Fig. 8. The central element of a Chinese sentence is a verb or an adjective. All adjectives are given "I1" as their verb subclassification code (VC, verb classification). As for verbs, in Chinese, for example, the subclassification codes of "say" are "I3" and "T1". From these, with reference to Fig. 8, the basic sentence patterns "S+V+CN" and "S+V+O" are obtained. For example, "I say that the teacher has come" and "I tell a story" are basic sentence patterns of "say". The sentence elements S (subject, "I" in the Chinese examples above), V (verb, "say" in the examples), CN (narrative complement, "the teacher has come" above), and O (object, "story" above) are the fundamental elements of the verb "say". That is, each of these elements is obligatory and must be present. If any of them is missing, the meaning of the sentence is incomplete and cannot be understood. For example, "I say" (S+V) is not a complete Chinese sentence.
In addition, among the example sentences of Fig. 8, the second example sentence of I1 means "he is beautiful" (note: "beautiful" is an adjective), the example sentence of T2 means "he slashed the enemy once with a knife", and the example sentence of D2 means "he gave a toy car to his younger brother".
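The lookup from verb subclassification code to basic sentence pattern described above can be sketched as follows. This is a minimal illustration, not the contents of Fig. 8: the table holds only the three codes whose patterns can be read from the text (I3 is paired with "S+V+CN" by the order in which the text lists them), and the names `BASIC_PATTERNS`, `basic_patterns`, and `is_complete` are hypothetical.

```python
# Hypothetical excerpt of a Fig. 8 style table:
# verb subclassification code -> basic sentence pattern.
BASIC_PATTERNS = {
    "I2": "S+V",      # e.g. "laugh"
    "I3": "S+V+CN",   # e.g. "say" with a narrative complement
    "T1": "S+V+O",    # e.g. "say" or "put" with an object
}


def basic_patterns(vc_codes):
    """Return the basic sentence pattern for each subclassification code, in order."""
    return [BASIC_PATTERNS[code] for code in vc_codes]


def is_complete(pattern, present):
    """A sentence is complete only if every fundamental element of the pattern is present."""
    return all(element in present for element in pattern.split("+"))


# The verb "say" carries two codes, so it yields two candidate patterns.
say_patterns = basic_patterns(["I3", "T1"])
# "I say" supplies only S and V, so it is incomplete under "S+V+O".
i_say_ok = is_complete("S+V+O", {"S", "V"})
```

A verb with several codes simply yields several candidate patterns, which is why the table is keyed by code rather than by verb.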
(2) Elements other than those of the basic sentence pattern are generated at one or more positions between the obligatory elements above, according to their own case mark (the role the word plays with respect to the verb in the sentence), their own meaning, and the semantic dominant code (the meaning of the verb or adjective that governs the element in the dependency structure). These elements are here called free elements, and the positions in which they are placed are called slot positions. For example, a time adverb can be placed in the 1st or the 2nd slot. In this embodiment, the positions at which free elements can be generated are divided into the four slots shown below.
(1) + subject + (2) + verb + (3) + object (O, Oi, Od, C, CN) + (4)
Here, each number in parentheses in the formula marks the position of a slot, and the number is the slot number. The symbols in parentheses after "object" denote the kinds of object: "O" is a general object, "Oi" an indirect object, "Od" a direct object, "C" a complement, and "CN" a narrative complement.
Examples of the generation positions of free elements are shown in Fig. 9.
(3) Apart from restrictions on the time elements in each slot, the placement of the free elements is basically unrestricted. As an example of a restriction on time elements, a time element in the 2nd slot must be placed before a place-case element. The restrictions on the order of the elements in each slot in this embodiment are shown in Fig. 10.
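The four-slot template can be illustrated with a short sketch. It assumes a sentence is held as a subject, a verb, an optional object, and a mapping from slot number to the free elements placed there; the function name `fill_slots` and the English glosses standing in for Chinese words are assumptions made for illustration.

```python
def fill_slots(subject, verb, obj=None, slots=None):
    """Arrange elements as (1) + subject + (2) + verb + (3) + object + (4)."""
    slots = slots or {}
    parts = []
    parts += slots.get(1, [])   # slot 1: before the subject
    parts.append(subject)
    parts += slots.get(2, [])   # slot 2: between subject and verb
    parts.append(verb)
    parts += slots.get(3, [])   # slot 3: between verb and object
    if obj is not None:
        parts.append(obj)
    parts += slots.get(4, [])   # slot 4: after the object
    return parts


# A time adverb may sit in slot 1 or in slot 2.
in_slot_2 = fill_slots("he", "plays", slots={2: ["yesterday", "at school"]})
in_slot_1 = fill_slots("he", "plays", slots={1: ["yesterday"], 2: ["at school"]})
```

Keeping the slots separate from the fundamental elements means a free element can be moved between slots without touching the basic sentence pattern.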
Next, the "word meaning" refers to the meaning (or the semantic code) of the morpheme itself. The embodiment described below adopts the meaning classification given in the thesaurus published by Kadokawa Shoten of Japan (1985). Under this classification, a four-digit code made up of hexadecimal digits classifies a morpheme at four levels, namely major class (1st digit), middle class (2nd digit), minor class (3rd digit), and fine class (4th digit), and expresses all the information of one morpheme. The thesaurus divides all morphemes into ten major classes, "nature", "properties", "change", "action", "mood", "person", "disposition", "society", "arts and learning", and "articles", and each major class is further divided into ten middle classes. In this embodiment, an "S" is prefixed to these four digits, so that meanings are expressed as follows.
S0 (the "nature" class)
S02 ("meteorology", which belongs to the "nature" class)
S028 ("wind", which belongs to the "meteorology" class)
S028a ("power", which belongs to the "wind" class)
In a hierarchical classification code of this kind, a higher-level semantic code covers a wider range of meaning than a lower-level one. That is, the lower the digit position a code extends to, the narrower its range of meaning. Therefore, where the higher-level semantic code is sufficient in practice, the unnecessary lower-level codes need not be registered one by one, which saves memory. In addition, because the semantic code is numeric, numerical operations can be applied to it, for example a logical AND or a comparison of character strings (a matching operation on two strings). When translations are selected and converted according to the semantic classification code, this not only makes the processing simple for a computer (see the previously disclosed Japanese Patent Application No. Hei 3-202954), but also allows more valuable information to be derived from the semantic codes. A detailed description of the semantic code is likewise disclosed in Japanese Patent Application No. Hei 3-202954 and is therefore omitted here.
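One way to exploit the hierarchy of the semantic codes is prefix comparison, sketched below. The text itself mentions logical AND and string matching; the prefix test used here is an assumed stand-in for that matching, and the function names and the code "S021" are hypothetical.

```python
def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors in the hierarchy."""
    return specific.startswith(general)


def common_class(a, b):
    """Longest shared prefix of two codes: the narrowest class containing both."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


# "S02" (meteorology) covers "S028a"; the reverse does not hold.
covers = subsumes("S02", "S028a")
# "S021" is a made-up sibling code used only for this example.
shared = common_class("S028a", "S021")
```

A result of just "S" from `common_class` would mean the two codes share no class at all, since "S" is only the prefix added in this embodiment.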
Description of drawings
Fig. 1 is a structural diagram of one embodiment of the present invention.
Fig. 2 is a flowchart of the operation of the preprocessing unit in the above embodiment.
Fig. 3 is a flowchart of the operation of the fundamental element expansion unit in the above embodiment.
Fig. 4 is a flowchart of the operation of the free element expansion unit in the above embodiment.
Fig. 5 is a flowchart of the operation of the special sentence pattern generation unit in the above embodiment.
Fig. 6 is a flowchart of the operation of the element position adjustment unit in the above embodiment.
Fig. 7 is a schematic diagram of the post-processing unit in the above embodiment.
Fig. 8 is a schematic diagram of the data structure stored in the basic sentence pattern storage device in the above embodiment.
Fig. 9 is a schematic diagram of the data structure stored in the sentence element information storage device in the above embodiment.
Fig. 10 is a schematic diagram of the data structure stored in the syntactic element sequence storage device in the above embodiment.
Fig. 11 is the first half of a diagram showing the processing of the above embodiment by a concrete example. (Note: the diagram was divided when input.)
Fig. 12 is the second half of the diagram showing the processing of the above embodiment by a concrete example.
Fig. 13 is a system block diagram of the translation process of a machine translation apparatus of the general intermediate-structure type.
Fig. 14 is a system block diagram of a prior-art example.
Fig. 15 is a schematic diagram of the Chinese syntactic rules of a prior-art example.
Fig. 16 is a schematic diagram illustrating the "top to bottom, left to right" order of processing a tree.
Fig. 17 is a schematic diagram illustrating sentence structure and dependency structure, taking a Japanese sentence as an example.
In Fig. 1, the units are labeled as follows:
100 input unit
200 preprocessing unit
300 fundamental element expansion unit
350 basic sentence pattern storage device
400 free element expansion unit
450 sentence element information storage device
500 special sentence pattern generation unit
600 element position adjustment unit
650 syntactic element sequence storage device
700 post-processing unit
800 output unit
Embodiment
Fig. 1 is a block diagram of the Chinese generating device for machine translation of this embodiment. In the figure, 100 is the input unit, 200 the preprocessing unit, 300 the fundamental element expansion unit, 350 the basic sentence pattern storage device, 400 the free element expansion unit, 450 the sentence element information storage device, 500 the special sentence pattern generation unit, 600 the element position adjustment unit, 650 the syntactic element sequence storage device, 700 the post-processing unit, and 800 the output unit. In addition to the above, it goes without saying that the device is integrated with the main body of the machine translation apparatus, which has reference dictionaries carrying Japanese and Chinese semantic codes, a translation conversion section, various logical operation sections, a display section, a printing section, and so on. Since these are not directly related to the gist of the present invention, their illustration is omitted.
The function and configuration of each of the above units are described below.
The dependency structure of the Chinese is input from the input unit 100. After the Japanese sentence to be processed has undergone the various analyses of Japanese and the intermediate structure conversion (Japanese → Chinese), a Chinese dependency structure such as that shown in Fig. 11(a) is obtained, and this is what is input. In that figure, for example, the "LEX" to the left of the subject "I" indicates that "I" is a morpheme, and the small "S501" below it is the semantic code already explained. The "VC:T1" below "put" indicates that its subclassification code VC is T1. "DETERMINATIVE" denotes the demonstrative case. "put" in Fig. 11(a) is the central element, and the "N" in the box to its right indicates a noun.

The preprocessing unit 200 adds a zero node with the subject attribute to a Chinese dependency structure whose subject has been omitted. Its processing steps are described in detail later with reference to Fig. 2.

The basic sentence pattern storage device 350 holds, registered in advance, verb subclassification codes as search keys and the corresponding Chinese basic sentence patterns, for use when Chinese sentences are generated. Its structure is shown in Fig. 8. According to that figure, for example, the verb "laugh" belongs to verb classification code "I2", its basic sentence pattern is "S+V", and the sentence "he laughs" can be confirmed as the example sentence. The fundamental element expansion unit 300 refers to the basic sentence pattern storage device 350, using as the search key the verb subclassification code of the central element (verb or adjective) in the Chinese dependency structure, and generates the basic sentence structure. Its processing flow is described in detail later with reference to Fig. 3.

The sentence element information storage device 450 holds, registered in advance, the modifier's case mark, source-language surface symbol, semantic code, and semantic dominant code (the semantic code of the governing verb) as search keys, together with the corresponding sentence-initial surface symbol, sentence-final surface symbol, slot position, and so on, stored so that they can be used when Chinese sentences are generated. Its structure is shown in Fig. 9. According to that figure, it can be seen, for example, that in the example sentence given earlier, "he * * yesterday * has a meal at the * of school", the case mark of the word "at" is "LOCATION". Similarly, in "I took a plane from Shanghai to Beijing", the case marks of "from" and "to" are "STATE_FROM" and "LOC_TO" respectively. It can also be seen that semantic codes are used for the semantic dominant code. The free element expansion unit 400, for the sentence elements other than the fundamental elements, refers to the sentence element information storage device 450 using the case mark, semantic code, semantic dominant code, source-language surface symbol, and so on as search keys, and expands the free elements into the basic sentence structure. Its operation is described in detail later with reference to Fig. 4.

The special sentence pattern generation unit 500 refers to the special-sentence-pattern attributes of the verbs and adjectives and generates the special sentence pattern for each sentence structure, for example a "被" (bei) sentence or a "把" (ba) sentence. Its operation is also described in detail later with reference to Fig. 5.

The syntactic element sequence storage device 650 holds, registered in advance, the rules on the order restrictions of syntactic elements. Its structure is shown in Fig. 10. The element position adjustment unit 600 adjusts the arrangement order of the free elements in the sentence structure according to the restrictions registered in advance in the syntactic element sequence storage device 650. The post-processing unit 700 adds some auxiliary elements (for example conjunctions) and punctuation marks to the sentence structure, then linearizes it and outputs it to the output unit 800. Its operation flow is also described in detail later with reference to Fig. 7. The output unit 800 has a monitor and the like. The operation flow of each of the above units is described next.
First, the operation flow of the preprocessing unit 200 is described with reference to Fig. 2.
(Step S210) The Chinese dependency structure is input from the input unit 100.
(Step S220) Unprocessed verb nodes and adjective nodes are taken out in bottom-to-top, left-to-right order.
(Step S230) Whether the extraction succeeded is judged. If it failed, processing ends. If it succeeded, go to (Step S240).
(Step S240) Whether a subject exists among the modifiers of this node is judged. If it exists, return to (Step S220). If it does not, go to (Step S250).
(Step S250) A zero node with the subject attribute is added to the modifiers of this node.
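The preprocessing flow above can be sketched as a bottom-up, left-to-right walk over the dependency tree. The node layout (a dict with "lex", "pos", "case", and "modifiers") and the case label "SUBJECT" are assumptions made for this illustration; the patent does not specify a data format.

```python
def add_zero_subjects(node):
    """Visit children first (bottom-up, left to right); give every verb or
    adjective node that has no subject modifier an empty node carrying
    the subject attribute."""
    for child in node["modifiers"]:
        add_zero_subjects(child)
    if node["pos"] in ("V", "ADJ"):
        has_subject = any(m.get("case") == "SUBJECT" for m in node["modifiers"])
        if not has_subject:
            # The zero node has no surface word, only the subject attribute.
            zero = {"lex": None, "pos": "N", "case": "SUBJECT", "modifiers": []}
            node["modifiers"].insert(0, zero)
    return node


# A verb whose subject was omitted in the source sentence.
tree = {"lex": "put", "pos": "V", "modifiers": [
    {"lex": "book", "pos": "N", "case": "OBJECT", "modifiers": []},
]}
add_zero_subjects(tree)
```

After the call, the verb node has a subject placeholder, so the later stages can treat every verb and adjective as if its subject slot were filled.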
Next, the operation flow of the fundamental element expansion unit 300 is described with reference to Fig. 3.
(Step S310) The Chinese dependency structure is received from the preprocessing unit 200, and the received dependency structure is stored in a buffer.
(Step S320) Unprocessed verb nodes and adjective nodes are taken out of the dependency structure in top-to-bottom, left-to-right order.
(Step S330) Whether the extraction succeeded is judged. If it failed, processing ends. If it succeeded, go to (Step S340).
(Step S340) The subclassification code of the verb of this node is taken out.
(Step S350) Using this classification code as the search key, the basic sentence pattern is retrieved from the basic sentence pattern storage device, and the retrieved basic sentence pattern is stored in a buffer. Then go to (Step S360).
(Step S360) The basic sentence structure is generated with reference to the basic sentence pattern stored in the buffer, and is stored together with the attributes of the connected nodes. Then return to (Step S320).
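Steps S340 to S360 can be sketched as a table lookup followed by filling the pattern from the verb's modifiers. The pattern excerpt, the mapping from case marks to pattern elements, and the node layout are all assumptions for illustration; only the pairing of T1 with "S+V+O" and of I2 with "S+V" comes from the text.

```python
# Hypothetical excerpt of the basic sentence pattern table.
PATTERNS = {"I2": ["S", "V"], "T1": ["S", "V", "O"]}
# Assumed mapping from case marks to pattern elements.
CASE_TO_ELEMENT = {"SUBJECT": "S", "OBJECT": "O"}


def expand_fundamental(verb):
    """Look up the pattern for the verb's subclassification code and fill it
    with the modifiers that are fundamental elements; others are left for
    the free element stage."""
    pattern = PATTERNS[verb["vc"]]
    structure = {element: None for element in pattern}
    structure["V"] = verb["lex"]
    for mod in verb["modifiers"]:
        element = CASE_TO_ELEMENT.get(mod["case"])
        if element in structure:
            structure[element] = mod["lex"]
    return structure


put = {"lex": "put", "vc": "T1", "modifiers": [
    {"lex": "I", "case": "SUBJECT"},
    {"lex": "this book", "case": "OBJECT"},
    {"lex": "car", "case": "LOCATION"},   # not fundamental: ignored here
]}
basic = expand_fundamental(put)
```

The place-case modifier is deliberately left out of the result, since it is a free element rather than part of the basic sentence pattern.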
Next, the operation flow of the free element expansion unit 400 is described with reference to Fig. 4.
(Step S410) The sentence structure is received from the fundamental element expansion unit 300.
(Step S420) Unprocessed sentence structures are taken out in top-to-bottom, left-to-right order.
(Step S430) Whether the extraction succeeded is judged. If it failed, processing ends. If it succeeded, go to (Step S440).
(Step S440) Using the verb or adjective in the sentence structure as the search key, the dependency structure stored in the buffer is referenced and the dependency structure corresponding to this verb or adjective is retrieved. The retrieved dependency structure is stored in a buffer.
(Step S450) An unprocessed free element is taken out with reference to the dependency structure stored in the above buffer.
(Step S460) Whether the extraction succeeded is judged. If it failed, go to (Step S465). If it succeeded, go to (Step S470).
(Step S465) The original sentence structure is replaced with the sentence structure whose processing has finished. Then go to (Step S420).
(Step S470) Using the case mark of the free element together with the Japanese surface symbol, semantic code, and semantic dominant code as search keys, the sentence element information storage device 450 is searched, and the Chinese sentence-initial surface symbol, sentence-final surface symbol, and slot position for generation are found.
(Step S480) A PP (prepositional phrase) is generated with reference to the sentence-initial surface symbol, the sentence-final surface symbol, and the place postposition. Then go to (Step S490).
(Step S490) The above PP is generated in the sentence structure with reference to the slot position. Then return to (Step S450).
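Steps S470 to S490 amount to a keyed lookup that returns a preposition and a slot, followed by building the phrase and placing it. The sketch below simplifies the search key to the case mark and the semantic dominant code, and its table rows and matching conditions are invented; they are not the contents of Fig. 9. The English glosses "in" and "at" stand in for Chinese prepositions, and "S4100" is a made-up code.

```python
# Hypothetical rows in the spirit of Fig. 9. "dominant" is a prefix that the
# governing verb's semantic code must start with ("" matches any verb).
TABLE = [
    {"case": "LOCATION", "dominant": "S38", "head": "in", "tail": "", "slot": 4},
    {"case": "LOCATION", "dominant": "", "head": "at", "tail": "", "slot": 2},
]


def find_entry(case, dominant_code):
    """Return the first row whose case mark matches and whose dominant-code
    prefix covers the governing verb's semantic code."""
    for row in TABLE:
        if row["case"] == case and dominant_code.startswith(row["dominant"]):
            return row
    return None


def expand_free(slots, word, case, dominant_code):
    """Build the phrase from head symbol + word + tail symbol and put it in its slot."""
    row = find_entry(case, dominant_code)
    if row is None:
        return slots
    phrase = " ".join(p for p in (row["head"], word, row["tail"]) if p)
    slots.setdefault(row["slot"], []).append(phrase)
    return slots


# "car" governed by "put" (dominant code S3830) lands in slot 4.
placed = expand_free({}, "car", "LOCATION", "S3830")
```

Taking the first matching row mirrors the rule in the text that the first candidate is chosen when several remain; ordering the table from specific to general is a design choice of this sketch.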
Next, the operation flow of the special sentence pattern generation unit 500 is described with reference to Fig. 5.
(Step S510) The sentence structure is input from the free element expansion unit 400.
(Step S520) Unprocessed sentence structures are taken out in top-to-bottom, left-to-right order, then go to (Step S530).
(Step S530) Whether the extraction succeeded is judged. If it failed, processing ends. If it succeeded, go to (Step S540).
(Step S540) According to the syntactic rules of Chinese (that is, whether a prepositional phrase or another element such as an adverbial appears after the object), whether the sentence is a "把" (ba) sentence is judged. If it is a "把" (ba) sentence, go to (Step S545). If it is not, go to (Step S550).
(Step S545) A "把" (ba) phrase is generated in slot 2.
(Step S550) Whether the sentence is a causative sentence or a passive sentence is judged. If it is either, go to (Step S555). If it is neither, go to (Step S560).
(Step S555) The "causative" or "passive" element is generated in slot 2. Then go to (Step S560).
(Step S560) Whether the sentence is negative is judged. If it is negative, go to (Step S565). If it is affirmative, go to (Step S570).
(Step S565) A negative auxiliary node such as "no" or "not have" is generated in slot 2.
(Step S570) Whether any other unprocessed element (for example an auxiliary verb) exists is judged. If one exists, go to (Step S575). If none exists, return to (Step S520).
(Step S575) After the node of the other element is generated in the corresponding slot, return to (Step S520).
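The sequence of checks in Fig. 5 can be sketched as follows, assuming the checks run one after another and that each marker is appended to slot 2. The dict layout, the flag names "voice" and "negative", and the English placeholder markers are hypothetical; real output would use the Chinese function words.

```python
def add_special_markers(s):
    """s: dict with 'object', 'slots' (slot number -> list) and optional flags.
    Appends ba, causative/passive, and negation markers to slot 2."""
    slot2 = s["slots"].setdefault(2, [])
    if s.get("object") and s["slots"].get(4):
        # A phrase follows the object: front the object with ba and leave
        # an empty object position after the verb.
        slot2.append("ba " + s["object"])
        s["object"] = None
    if s.get("voice") in ("causative", "passive"):
        slot2.append(s["voice"] + " marker")
    if s.get("negative"):
        slot2.append("not")
    return s


sentence = {"object": "this book", "slots": {4: ["in car"]}}
add_special_markers(sentence)
```

Moving the object into slot 2 is what later leaves slot 3 empty while slots 2 and 4 each hold one phrase.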
Next, the operation flow of the element position adjustment unit 600 is described with reference to Fig. 6.
(Step S610) The sentence structure is received from the special sentence pattern generation unit 500.
(Step S620) Unprocessed sentence structures are taken out in turn in top-to-bottom, left-to-right order, then go to (Step S630).
(Step S630) Whether the extraction succeeded is judged. If it failed, processing ends. If it succeeded, go to (Step S640).
(Step S640) The unprocessed sentence structure is stored in a buffer.
(Step S650) The variable i is initialized to 1. Then go to (Step S660).
(Step S660) With reference to the sentence structure stored in the buffer, all the elements in the i-th slot are taken out as the set ES (the name of a buffer; the letters have no special meaning).
(Step S665) Whether the number of elements in the set ES is 0 or 1 is judged. If it is 0 or 1, go to (Step S690). Otherwise, go to (Step S670).
(Step S670) Using i as the search key, the syntactic element sequence storage device 650 is referenced, the syntactic element order restrictions are retrieved, and they are taken as the set SSLS (the storage location of the syntactic element order; again the letters have no special meaning).
(Step S680) With reference to the restrictions of SSLS shown in Fig. 10, a matching operation is carried out, for each slot of the generated sentence, between the order of its elements and the element order restrictions of the slot. If corresponding elements exist, the element order of the slot being processed is replaced by the order given in the restriction.
(Since this is hard to follow, another example sentence is used to illustrate it. For the sentence "he, at school, yesterday, played", the element order in slot 2 is "LOC (place) + TIME (time)". The above matching operation yields "TIME, LOC". The first condition, "TIME+LOC", is therefore applied and the element order of the slot in the generated sentence is replaced. The result is "TIME+LOC", and the generated sentence becomes "he, yesterday, at school, played".)
(Step S685) The element set of the i-th slot of the sentence structure stored in the buffer is replaced with ES. Then go to (Step S690).
(Step S690) The variable i is incremented by 1.
(Step S950) Whether i is greater than 4 is judged. If i > 4 does not hold, return to (Step S660). If it holds, return to (Step S620).
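The loop over the four slots can be sketched with a stable sort. The restriction table holds only the "TIME before LOC" rule for slot 2 that the text states; treating element types without a rule as ranking after the restricted ones, and the names `ORDER_LIMITS` and `adjust_slots`, are assumptions of this sketch.

```python
# Hypothetical excerpt of a Fig. 10 style table:
# slot number -> required relative order of element types.
ORDER_LIMITS = {2: ["TIME", "LOC"]}


def adjust_slots(slots):
    """slots: slot number -> list of (element_type, text). For i = 1..4,
    reorder the elements of slot i according to its order restriction."""
    for i in range(1, 5):
        es = slots.get(i, [])
        if len(es) <= 1:          # 0 or 1 elements: nothing to adjust
            continue
        limit = ORDER_LIMITS.get(i, [])
        rank = {etype: k for k, etype in enumerate(limit)}
        # sorted() is stable, so elements of equal rank keep their order.
        slots[i] = sorted(es, key=lambda e: rank.get(e[0], len(limit)))
    return slots


before = {2: [("LOC", "at school"), ("TIME", "yesterday")]}
after = adjust_slots(before)
```

Because only the relative order inside one slot changes, the adjustment never moves an element from one slot to another.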
Next, the operation flow of the post-processing unit 700 is shown in Fig. 7.
(Step S710) The sentence structure generated by the element position adjustment unit is input.
(Step S720) Other auxiliary elements (for example interrogative adverbs and sentence conjunctions) are generated.
(Step S730) Punctuation marks are generated.
(Step S740) The Chinese sentence is linearized: the Chinese elements at the leaf nodes are taken out from left to right. Then go to (Step S750).
(Step S750) The generated Chinese sentence is sent to the output unit 800, and processing ends.
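Linearization in step S740 is a left-to-right collection of leaf nodes. The sketch below assumes the sentence structure is a nested list whose leaves are words and whose empty nodes are `None`; the English gloss tokens ("ba", "CL" for the classifier) and the joining with spaces are conveniences of the example, since Chinese text would be joined without spaces.

```python
def leaves(node):
    """Collect leaf words from left to right; empty nodes (None) are skipped."""
    if node is None:
        return []
    if isinstance(node, str):
        return [node]
    words = []
    for child in node:
        words += leaves(child)
    return words


def linearize(tree, punctuation="."):
    """Join the leaves in order and append the sentence-final punctuation."""
    return " ".join(leaves(tree)) + punctuation


# Gloss of a ba sentence: the object sits before the verb, and the object
# position after the verb is an empty node.
tree = ["I", ["ba", ["this", "CL", "book"]], ["put", None], ["in", "car"]]
sentence = linearize(tree)
```

Skipping empty nodes is what lets the earlier stages leave placeholders, such as a zero subject or a vacated object position, without affecting the output string.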
Next, the generation of a Chinese sentence in this embodiment is described concretely, taking Japanese-to-Chinese translation as an example.
The Chinese dependency structure shown in Fig. 11(a), corresponding to the Japanese sentence "私は車にこの本を置いている。" ("I put this book in the car."), is input through the input unit 100 and sent to the preprocessing unit 200, which processes it by the steps described for Fig. 2. In Fig. 11(a), the modifiers of the central element "put" are "I", "book", and "car", and their case marks are the nominative, objective, and place cases respectively. The modifier of the fundamental element "book" is "this", and its case mark is the demonstrative case. Since the subject in this dependency structure is not omitted, the structure is passed directly to the fundamental element expansion unit 300, which processes it by the steps described for Fig. 3. By the processing of step S340 in Fig. 3, the verb subclassification code VC is taken from the attributes of the verb node of the central element "put". The VC taken out is T1. Using this subclassification code as the search key, the basic sentence pattern storage device 350 is referenced and the Chinese basic sentence pattern "S+V+O" is obtained. Next, with reference to the basic sentence pattern obtained, the basic sentence structure of the Chinese is generated. This basic sentence structure is shown in Fig. 11(b). In the figure, "NP" denotes a noun phrase and "VF" a verb phrase.
Processing then passes to the free element expansion unit 400 shown in Fig. 4, which finds "car" in the dependency structure as a free element. For the case mark of the node found (LOCATION) there are two candidates, as shown in Fig. 9. Therefore, using the case mark of the free element together with the Japanese surface symbol, the semantic code (S9970), and the semantic dominant code (S3830) as search keys, the sentence element information storage device of Fig. 9 is referenced and a matching operation is carried out on the case mark entries to detect the possible candidates. For "LOCATION" in Fig. 9, the Japanese surface symbol field is blank, so no operation on it is needed, and it suffices to perform a logical AND on the semantic dominant code and the semantic code. The best candidate is found on the basis of the result of the operation. If several candidates remain, the sentence-initial surface symbol, sentence-final surface symbol, and slot position of the first candidate are selected. As a result, the sentence-initial surface symbol (the place preposition "at") and the slot position "4" in the top row of Fig. 9 are obtained. Accordingly, with reference to the sentence structure of Fig. 11(b), the free element is expanded at the position of slot 4, giving the sentence structure shown in Fig. 12(c).
Processing then passes to special sentence pattern generation unit 500 of Fig. 5. Because a PP phrase follows the object in the sentence structure, a "把" (ba) sentence must be generated. The result produced by the special sentence pattern generation unit is shown in Figure 12(d).
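The special sentence pattern step can be illustrated with a small sketch. It assumes a flat list of `(role, value)` pairs as the sentence structure and a hypothetical function `make_ba_sentence`; the only rule taken from the text is that a PP following the object triggers the "ba" pattern, with the real object moved in front of the verb and an empty object node left behind.

```python
def make_ba_sentence(structure):
    """Front the object as a "ba" phrase when a PP directly follows it."""
    roles = [role for role, _ in structure]
    if "V" not in roles or "O" not in roles:
        return structure
    o = roles.index("O")
    # Trigger condition from the text: a PP comes right after the object.
    if o + 1 >= len(structure) or structure[o + 1][0] != "PP":
        return structure
    obj = structure[o][1]
    out = list(structure)
    out[o] = ("O", None)                # empty object node stays after the verb
    v = roles.index("V")
    out.insert(v, ("PP", ("ba", obj)))  # fronted object placed before the verb
    return out


before = [("S", "I"), ("V", "put"), ("O", "this book"), ("PP", ("in", "car"))]
after = make_ba_sentence(before)
print(after)
```

A structure without a PP after the object is returned unchanged, so the ordinary "S+V+O" order is kept in that case.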
Processing then passes to position adjustment unit 600 shown in Fig. 6. Syntactic element sequence storage device 650 is consulted and the element ordering restriction for each slot is taken out. The arrangement order of the elements in each slot is then adjusted by comparison and matching. In this example, as shown in Figure 12(d), there is no element in front of the subject "I" and, likewise, no element between the verb "put" and the object (the real object has moved in front of the verb, leaving an empty object node after the verb), so the number of elements in slot 1 and in slot 2 is zero in each case. There is a PP element between the subject and the verb and another PP element after the empty object, so the number of elements in slot 1 and in slot 2 is one in each case. Consequently, no adjustment is necessary.
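The per-slot check can be sketched as follows. This is only an illustration under assumptions: the ordering table `SLOT_ORDER`, its element kinds, and the function `adjust_slot` are hypothetical and do not reproduce the contents of the syntactic element sequence storage device.

```python
# Hypothetical ordering restrictions: slot position -> element kinds in the
# order in which they must appear inside that slot.
SLOT_ORDER = {2: ["TIME", "PP"], 4: ["PP", "MANNER"]}


def adjust_slot(slot, elements):
    """Reorder the elements of one slot according to its ordering restriction."""
    if len(elements) < 2:
        return elements  # zero or one element: nothing to adjust
    order = SLOT_ORDER.get(slot, [])
    rank = {kind: i for i, kind in enumerate(order)}
    # sorted() is stable, so elements of unknown kind keep their relative order
    # and are placed after the known kinds.
    return sorted(elements, key=lambda e: rank.get(e[0], len(order)))


print(adjust_slot(4, [("PP", "in car")]))
print(adjust_slot(2, [("PP", "ba this book"), ("TIME", "yesterday")]))
```

The first call corresponds to the situation in the example, where a slot holds a single element and is left as it is.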
Processing then passes to post-processing unit 700 shown in Fig. 7. As in step S720, the demonstrative pronoun "this" and the classifier "本" (ben) are generated; the resulting sentence structure is shown in Figure 12(c). The punctuation mark "." is then generated and the sentence structure is linearized, that is, the Chinese morphemes at the terminal nodes are retrieved from left to right. As a result, the correct translation "I [ba] this book put in the car." is obtained, and an unnatural Chinese sentence such as "I put this book in the car.", with the object left after the verb, is not generated. Finally, the translation result is sent to output unit 800, such as a monitor or printer, and output.
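Linearization as described above amounts to a left-to-right traversal that collects the terminal nodes. The sketch below assumes a tree of `(label, children)` pairs in which a string marks a terminal node and an empty string marks an empty node; the labels, the English glosses, and the function `linearize` are hypothetical stand-ins for the Chinese morphemes.

```python
def linearize(node):
    """Collect the terminal words of a sentence structure from left to right."""
    _label, children = node
    if isinstance(children, str):
        # Terminal node; an empty string is an empty node and yields nothing.
        return [children] if children else []
    words = []
    for child in children:
        words.extend(linearize(child))
    return words


tree = ("S", [
    ("NP", "I"),
    ("PP", [("P", "ba"), ("NP", [("DET", "this"), ("CL", "ben"), ("N", "book")])]),
    ("VF", "put"),
    ("NP", ""),  # empty object node left after the verb
    ("PP", [("P", "in"), ("N", "car")]),
    ("PUNCT", "."),
])
words = linearize(tree)
print(" ".join(words))
```

Skipping empty nodes during the traversal is what keeps the empty object node from appearing in the output sentence.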
The present invention has been described above with reference to an embodiment, but it is not limited to that embodiment. It may be modified and implemented as appropriate within a scope that does not change its gist. For example:
(1) In manufacture, a single essential structural element of the invention may be physically or mechanically split into several parts or, conversely, several elements may be merged into one, in any suitable combination.
(2) The necessary hardware and software may be added to a conventional device so that it has the functions of the present invention.
(3) The source language may be a language other than Japanese, such as English.
(4) The input language structure need not be a dependency structure; it may be another syntactic structure, on the basis of which the Chinese sentence is generated.
As described above, the Chinese generating apparatus for machine translation of the present invention fully solves all of the aforementioned problems of the prior art. Specifically, the following effects are obtained.
(1) The basic sentence pattern of Chinese can be determined from the verb subclassification code of Chinese. Furthermore, because concepts such as basic elements, free elements, and slot positions are introduced, the number of syntactic generation rules for Chinese sentences can be reduced. The generation rules for Chinese can therefore be maintained and managed more easily.
(2) Because the number of rules in the system is further reduced, problems such as competition among several applicable rules seldom arise. The efficiency of executing the Chinese generation system can therefore be improved effectively.
(3) Because semantic and syntactic information about Chinese is used together, Chinese that is closer to actual usage, more natural, and of higher quality can be generated.
For all of these reasons, the practical effect of the present invention is very great.

Claims (2)

  1. A Chinese generating apparatus for machine translation, which receives as input a Chinese dependency structure obtained by performing language analysis on a source-language sentence in advance and then performing intermediate conversion, and which converts the input dependency structure into a Chinese sentence, characterized by comprising a basic sentence pattern storage device, a sentence element information storage device, a syntactic element sequence storage device, a sentence structure generating device, a free element generating device, and a sentence generating device, wherein: the basic sentence pattern storage device registers in advance the subclassification code of each Chinese verb and the basic sentence pattern structure corresponding to it; the sentence element information storage device registers in advance element information such as the case marker of a modifier, the source-language surface symbol, the semantic dominance code, and the semantic code, together with the corresponding Chinese sentence-head surface symbol, sentence-tail surface symbol, and slot position; the syntactic element sequence storage device registers in advance slot positions and the ordering restrictions on the syntactic elements corresponding to them; the sentence structure generating device detects, in the input Chinese dependency structure, verbs and adjectives whose subject has been omitted, adds an empty node with the subject attribute to the sentence structure, and then, for each element formed by a verb or adjective in the dependency structure, consults the basic sentence pattern storage device according to the verb subclassification code of the verb or the pseudo-verb classification code of the adjective, takes out the corresponding basic sentence pattern, and generates the Chinese sentence structure; the free element generating device, for the elements of the dependency structure other than the basic elements necessary to constitute a sentence, consults the sentence element information storage device and, for each such element, takes out the corresponding Chinese sentence-head surface symbol, sentence-tail surface symbol, and slot position according to its case marker, source-language surface symbol, semantic dominance code, semantic code, and the like, generates the free element at the corresponding position of the sentence structure with reference to the slot position taken out, and then generates a special sentence pattern for each sentence structure with reference to the special-sentence-pattern attribute value of each verb and adjective; and the sentence generating device takes out in turn, from the syntactic element sequence storage device, the syntactic element ordering restriction for each slot, checks and adjusts the arrangement order of the elements in the corresponding slot of the sentence structure, then linearizes the sentence structure, and finally generates the Chinese sentence.
  2. The Chinese generating apparatus for machine translation according to claim 1, characterized in that: the sentence structure generating device has a preprocessing unit and a basic element expansion unit, wherein the preprocessing unit detects, in the input Chinese dependency structure, verbs and adjectives whose subject has been omitted and adds an empty node with the subject attribute to the sentence structure, and the basic element expansion unit consults the basic sentence pattern storage device according to the verb subclassification code of an element in the dependency structure, takes out the corresponding basic sentence pattern, and generates the basic sentence structure of Chinese; the free element generating device has a free element expansion unit and a special sentence pattern generation unit, wherein the free element expansion unit, for the elements of the dependency structure other than the basic elements, takes out for each element from the sentence element information storage device the corresponding Chinese sentence-head surface symbol, sentence-tail surface symbol, slot position, and the like according to its case marker, source-language surface symbol, semantic dominance code, and semantic code, and generates the free element at the corresponding position of the sentence structure with reference to the slot position taken out, and the special sentence pattern generation unit generates a special sentence pattern for each sentence structure produced by the free element expansion unit with reference to the special-sentence-pattern attribute value of each verb and adjective; and the sentence generating device has an element position adjustment unit and a post-processing unit, wherein the element position adjustment unit takes out each syntactic element ordering restriction in turn from the syntactic element sequence storage device and checks and adjusts the arrangement order of the elements in the corresponding slot of the sentence structure, and the post-processing unit linearizes the sentence structure, thereby obtaining the Chinese sentence.
CN 96112514 1995-09-11 1996-09-05 Chinese forming device for machine translation Expired - Fee Related CN1120439C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP7232628A JPH0981568A (en) 1995-09-11 1995-09-11 Chinese language generation device for machine translation
JP232628/95 1995-09-11
JP232628/1995 1995-09-11

Publications (2)

Publication Number Publication Date
CN1156287A CN1156287A (en) 1997-08-06
CN1120439C true CN1120439C (en) 2003-09-03

Family

ID=16942304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 96112514 Expired - Fee Related CN1120439C (en) 1995-09-11 1996-09-05 Chinese forming device for machine translation

Country Status (2)

Country Link
JP (1) JPH0981568A (en)
CN (1) CN1120439C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1333361C (en) * 2004-06-30 2007-08-22 高庆狮 Method and device for improving accuracy of character and speed recognition and automatic translation system

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2464932A1 (en) * 2001-10-29 2003-05-08 Stephen Clifford Appleby Machine translation
EP1703415A4 (en) * 2003-12-16 2010-03-10 Sharp Kk Device for creating sentence having decoration information
CN101593174A (en) * 2009-03-11 2009-12-02 林勋准 A kind of machine translation method and system
CN101510194B (en) * 2009-03-15 2015-09-09 刘树根 A kind of multilingual professional translation method based on sentence component
CN103314369B (en) * 2010-12-17 2015-08-12 北京交通大学 Machine translation apparatus and method
CN102043849B (en) * 2010-12-20 2015-03-25 惠州市表意软件有限公司 Realization method for electronic dictionary system with ideographic components as elements
JP5800206B2 (en) * 2013-03-01 2015-10-28 日本電信電話株式会社 Word order rearrangement device, translation device, translation model learning device, method, and program
CN105808530B (en) * 2016-03-23 2019-11-08 苏州大学 Interpretation method and device in a kind of statistical machine translation
CN109460552B (en) * 2018-10-29 2023-04-18 朱丽莉 Method and equipment for automatically detecting Chinese language diseases based on rules and corpus
CN109684638B (en) * 2018-12-24 2023-08-11 北京金山安全软件有限公司 Clause method and device, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
JPH0981568A (en) 1997-03-28
CN1156287A (en) 1997-08-06

Similar Documents

Publication Publication Date Title
CN1205572C (en) Language input architecture for converting one text form to another text form with minimized typographical errors and conversion errors
CN1174332C (en) Method and device for converting expressing mode
CN1608259A (en) Machine translation
CN1158627C (en) Method and apparatus for character recognition
CN1578954A (en) Machine translation
CN1168068C (en) Speech synthesizing system and speech synthesizing method
CN1083952A (en) Authoring and translation system ensemble
CN1387639A (en) Language input user interface
CN1542649A (en) Linguistically informed statistical models of constituent structure for ordering in sentence realization for a natural language generation system
CN1120439C (en) Chinese forming device for machine translation
CN1652107A (en) Language conversion rule preparing device, language conversion device and program recording medium
CN1219266C (en) Method for realizing multi-path dialogue for man-machine Chinese colloguial conversational system
CN1465018A (en) Machine translation mothod
CN1993692A (en) A character display system
CN1328321A (en) Apparatus and method for providing information by speech
CN1648828A (en) System and method for disambiguating phonetic input
CN1770107A (en) Extracting treelet translation pairs
CN1834955A (en) Multilingual translation memory, translation method, and translation program
CN1135060A (en) Language processing apparatus and method
CN101042867A (en) Apparatus, method and computer program product for recognizing speech
CN1942877A (en) Information extraction system
CN1255213A (en) Language analysis system and method
CN86108582A (en) Shorthand translation system
CN1702650A (en) Apparatus and method for translating Japanese into Chinese and computer program product
CN1514387A (en) Sound distinguishing method in speech sound inquiry

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee