MXPA99003732A - Method and apparatus for automated language translation - Google Patents

Method and apparatus for automated language translation

Info

Publication number
MXPA99003732A
Authority
MX
Mexico
Prior art keywords
database
nominal
inputs
entries
language
Application number
MXPA/A/1999/003732A
Other languages
Spanish (es)
Inventor
Christy Sam
Original Assignee
Dialect Corporation
Application filed by Dialect Corporation
Publication of MXPA99003732A

Abstract

Language translation is accomplished by representing natural-language sentences in accordance with a constrained grammar and vocabulary structured to permit direct substitution of linguistic units in one language for corresponding linguistic units in another language. Preferably, the vocabulary is represented in a series of physically or logically distinct databases, each containing entries representing a form class as defined in the grammar. Translation involves direct lookup between the entries of a reference sentence and the corresponding entries in one or more target languages.

Description

METHOD AND APPARATUS FOR AUTOMATED LANGUAGE TRANSLATION
FIELD OF THE INVENTION
The present invention relates generally to automated language translation, and in particular to a system for translating constrained linguistic constructions assembled according to an exact grammar.
BACKGROUND OF THE INVENTION
Ever since improvements in transportation began to significantly reduce the inconvenience and cost of cross-border travel, the desirability of universal communication has been recognized. In the 1960s, for example, international efforts were made to promote Esperanto as a universal language. While that effort ultimately failed, the large number of fluent speakers (between 1 and 15 million worldwide) and the scope of the efforts illustrate the importance of the problem. Esperanto did not succeed because it required the acquisition of both a new grammar and a new vocabulary, the latter posing a far greater challenge for would-be speakers.
The ease and speed with which information can now be transmitted throughout the world has increased the need for universal communication. Current efforts have focused more heavily on automated translation between languages. Systems now in use generally store, in a source language and a target language, millions of frequently used words, phrases and combinations, and rely for accuracy and robustness on the occurrence of these in the text to be translated. Such systems are by definition incomplete, since no system can possibly store every possible combination of words, and their usefulness varies with the linguistic idiosyncrasies of their designers and users. It is almost always necessary for a human to check and modify the resulting translation. These systems also translate one word at a time (and so operate slowly) and require a separate database unique to each target language. Furthermore, because they are programmed to recognize distinctive language features and their unique mappings from one language to another, each translation must be done individually. In other words, the time required for multiple translations is the sum of the times for each translation performed individually.
Translation is difficult for numerous reasons, including the lack of one-to-one word correspondences between languages, the existence of homonyms in every language, and the fact that natural grammars are idiosyncratic; they do not conform to an exact set of rules that would facilitate direct, word-for-word substitution. It is toward a computational "understanding" of these idiosyncrasies that many artificial-intelligence research efforts have been directed, and their limited success testifies to the complexity of the problem.
SUMMARY OF THE INVENTION
The present invention provides an artificial grammar for expressing the thoughts and information ordinarily conveyed in a natural grammar, but in a structured format amenable to automated translation.
Sentences in accordance with the invention are constructed based on a fixed series of rules applied to an organized natural vocabulary. The grammar is clear in the sense of being easily understood by native speakers of the vocabulary, and complex in its ability to express sophisticated concepts; but because sentences are derived from an organized vocabulary according to fixed rules, they can be readily translated from one language to another. Preferably, the vocabulary is represented in a series of physically or logically distinct databases, each containing entries that represent a form class as defined in the grammar. Translation involves direct lookup between the entries of a reference sentence and the corresponding entries in one or more target languages.
Unlike natural languages, the invention employs a finite (although flexible and extensible) lexicon, an exact set of form classes, and a finite and exact set of rules for sentence formation. Starting with a term from one of the four form classes, sentences can be constructed by iterative application of four expansion rules that govern the manner in which terms from the various classes can be combined. The resulting "sentences," while unlike those of the natural language they are intended to represent, can nevertheless represent, precisely and in detail, the full range of meaning of natural-language sentences. The invention exploits the relative ease of learning a new grammar, particularly one that is highly constrained to a few precise rules, as compared with learning a new vocabulary. As a result, after becoming familiar with this grammar, the user can easily compose sentences in the manner prescribed by the present invention.
Accordingly, to use the invention, a natural-language sentence is translated or decomposed into the (typically) simpler grammar of the invention while retaining the original vocabulary.
Although it is possible to accomplish this with some degree of automation, the full benefits of the invention are most directly realized by manual rendering, whether by primary translation or direct composition, into the grammar of the invention, which is easily learned and applied. And because translation involves simple substitution of equivalent entries from different languages, translation of an input sentence into multiple languages is accomplished almost instantaneously.
The translated output is as easily understood by a native speaker of the target language as the input was by the author of the original text. It is thus possible to carry on "conversations" in the grammar of the invention by formulating statements in accordance with the grammar, passing these to an interlocutor for translation and response, and translating the interlocutor's responses. For example, a business person native to the U.S. and with no knowledge of German can hold a meeting with native German speakers, using as a translation device a laptop computer configured in accordance with the invention and exchanging thoughts via the computer. Indeed, the same thoughts can be broadcast simultaneously to multiple interlocutors, each speaking a different language, with their individual responses likewise translated simultaneously. Correspondents can exchange messages by electronic mail (e-mail) in their native languages simply by formulating the messages in accordance with the grammar of the invention; recipients who speak different languages and whose e-mail systems implement the invention receive the message translated into their native languages, and their responses are automatically translated into the original sender's language upon arrival. In this way, each correspondent is exposed only to his or her native language.
The invention is advantageously employed even in situations that demand a final output in a natural language, since translation to this format is readily accomplished. For example, a news reporter might file a story worded in the grammar of the invention for dissemination to numerous bureaus serving different national audiences. The story is instantly translated into the appropriate languages upon arrival at the different bureaus, where it can then be further refined into a form suitable for communication to the audience. The skills required if further translation is desired are essentially editorial in nature, and thus require less specialized training than would be necessary for, say, true language translation; indeed, communications media already employ personnel to carry out the similar tasks of editing and revising raw news material taken from wire services.
In accordance with the invention, sentences are composed of "linguistic units," each of which may be one or a few words, from the allowed form classes. These classes are "things" or nominal terms that connote, for example, people, places, items, activities or ideas; "connectors" that specify relationships between two (or more) nominal terms; "descriptors" that modify the state of one or more nominal terms; and "logical connectors" that establish sets of nominal terms. The list of all allowed entries in all four classes represents the global lexicon of the invention. To construct a sentence in accordance with the invention, entries from the classes are combined according to four expansion rules detailed below. These rules can be followed explicitly in a stepwise fashion to produce sentences, but more typically, once the user is accustomed to the grammar, sentences are constructed by "feel" and, if necessary, subsequently tested for conformity with the expansion rules.
In this way, the invention overcomes the three obstacles noted above that have prevented the emergence of truly robust translation systems. The idiosyncratic nature of different grammars is overcome by substituting a fixed grammar, and the problem of one-to-one correspondence is addressed through a specialized, finite database. Homonyms are handled by explicitly labeling the different senses of a word and requiring explicit selection of the intended sense. These capabilities allow the invention to be applied conveniently and inexpensively to many languages, even exotic ones; current systems, by contrast, are directed almost exclusively toward the major languages because of the expense inherent in their design.
A representative hardware implementation includes a series of logically or physically distinct electronic databases in which the vocabulary is stored; a computer memory partition for accepting an input in a reference language and structured in accordance with the invention; and analysis means (generally a processor operated in accordance with stored computer instructions) for (i) addressing the databases with the input to retrieve the corresponding entries in the target language, and (ii) translating the sentence by replacing the input with the identified target entries.
BRIEF DESCRIPTION OF THE DRAWINGS
The description of the invention below refers to the accompanying drawings, of which: Figure 1 schematically illustrates application of the expansion rules of the present invention; and Figure 2 is a schematic representation of a hardware system embodying the invention.
DETAILED DESCRIPTION OF AN ILLUSTRATIVE EMBODIMENT
The system of the present invention makes use of a lexicon and a constrained set of grammatical rules. The lexicon comprises linguistic units divided into four classes.
Each linguistic unit is (1) a single word, such as "dog" or "government"; or (2) a combination of words, such as "parking space" or "prime minister"; or (3) a proper name; or (4) a word with a definition unique to the invention; or (5) one form of a word with multiple meanings. In this last case, each definition of the word represents a different linguistic unit, and the various definitions may appear as entries in different form classes. For purposes of automation, each definition is distinguished, for example, by the number of periods appearing at the end of the word. The entry for the first (arbitrarily designated) definition is listed with no period, the entry representing the second definition is listed with one period at its end, and so on. Alternatively, different word senses can be identified numerically, for example using subscripts.
Words unique to the invention may make up a very small proportion of the total lexicon, and none of these words is specific to the invention or alien to the natural language upon which it is based. Instead, invention-specific words are broadened in connotation to limit the overall number of terms in the lexicon. For example, in a preferred implementation, the word "use" is broadened to connote employment of any object for its primary intended purpose, so that in the sentence "Jake uses the book," the term connotes reading. The word "in" may be used to connote time (for example, "(I go to the ball game) yesterday"). If desired for ease of use, however, the invention-specific words can be eliminated altogether and the lexicon expanded accordingly.
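As an illustration only, the period-counting convention for word senses could be encoded as in the following Python sketch. The function names `encode_sense` and `decode_sense` are hypothetical and do not come from the specification, and the sketch assumes that senses are numbered from zero, so that the first (arbitrarily designated) sense carries no period. The word "bank" is used purely as an example of a word with several senses.

```python
from typing import Tuple


def encode_sense(word: str, sense_index: int) -> str:
    """Append one trailing period per sense index (index 0 = no period)."""
    if sense_index < 0:
        raise ValueError("sense_index must be non-negative")
    return word + "." * sense_index


def decode_sense(entry: str) -> Tuple[str, int]:
    """Split an encoded entry into (base word, sense index)."""
    base = entry.rstrip(".")
    return base, len(entry) - len(base)


print(encode_sense("bank", 0))  # bank
print(encode_sense("bank", 2))  # bank..
print(decode_sense("bank.."))   # ('bank', 2)
```

Because the sense index is carried by the entry itself, each encoded form can serve as a distinct lookup key, which is what lets different senses of one word live in different form classes.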
The invention divides the global lexicon of allowed terms into four classes: "things" or nominal terms that connote, for example, people, places, items, activities or ideas, identified here by the code T; "connectors" that specify relationships between two (or more) nominal terms (including words typically described as prepositions and conjunctions, and terms describing relationships in terms of action, being, or states of being), identified here by C; "descriptors" that modify the state of one or more nominal terms (including words typically described as adjectives, adverbs and intransitive verbs), identified here by D; and "logical connectors" that establish sets of nominal terms, identified here by C. Preferred lists of nominal terms, connectors and descriptors are set forth in Appendices 1-3, respectively. The preferred logical connectors are "and" and "or."
Naturally, the lexicon cannot and does not contain a list of possible proper names; instead, proper names, like other words not recognized by the invention, are returned inside brackets to indicate that translation did not occur. The system also does not recognize verb tenses; connectors are phrased in the present tense, since tense is easily understood from context. Tense may nonetheless be indicated by specifying a time, day and/or date.
Sentences in accordance with the invention are constructed from terms in the lexicon according to four expansion rules. The most basic sentences proceed from one of the following three constructions (any of which can be created from a T term in accordance with the expansion rules set forth below). These structures, which represent the smallest possible sets of words considered to carry information, are the basic building blocks of more complex sentences. Their structural simplicity facilitates ready translation into conversational, natural-language sentences; thus, even complex sentences in accordance with the invention are easily transformed into natural-language equivalents through modular analysis of the more basic sentence components (a process facilitated by the preferred representations described later).
Basic Structure 1 (BS1) is formed by placing a descriptor after a nominal term to form the structure TD. BS1 sentences such as "dog brown" and "Bill swims" translate readily into the English sentence "the dog is brown" (or the phrase "the brown dog") and "Bill swims."
BS2 is formed by placing a connector between two nominal terms to form the structure TCT. BS2 sentences such as "dog eats food" translate readily into their English equivalents.
BS3 is formed by placing a logical connector between two nominal terms to form a series represented by the structure TCT... The series can be a single conjunction, such as "Bob and Ted," or a compound structure such as "Bob and Ted and Al and Jill" or "red or blue or green."
A sentence comprising one or more of the basic structures set forth above may be expanded using the following rules:
Rule I: To a nominal term, add a descriptor (T → TD).
In accordance with Rule I, any linguistic unit from the nominal class can be expanded into the original item followed by a new item from the descriptor class, which modifies the original item. For example, "dog" becomes "dog big."
Like all rules of the invention, Rule I is not limited in its application to an isolated nominal term (although this is how BS1 sentences are formed); instead, it can be applied to any nominal term regardless of its location within a larger sentence. Thus, in accordance with Rule I, TD1 → (TD2)D1. For example, "dog big" becomes "(dog brown) big" (corresponding to the English sentence "the brown dog is big").
The order of addition may or may not be important in the case of consecutive adjectives, since these independently modify T; for example, in "(dog big) brown," the adjective "big" distinguishes this dog from other dogs, and "brown" may describe a feature thought to be otherwise unknown to the listener. The order of addition is almost always important where a D term is an intransitive verb. For example, expanding the TD sentence "dog runs" (corresponding to "the dog runs" or "the running dog") by adding the descriptor "fast" forms, in accordance with Rule I, "(dog fast) runs" (corresponding to "the fast dog runs"). To express "the dog runs fast," it is necessary to expand the TD sentence "dog fast" with the descriptor "runs" in the form "(dog runs) fast."
Applying Expansion Rule I to the structure BS2 produces TCT → (TD)CT. For example, "dog eats food" becomes "(dog big) eats food." Rule I can also be applied to compound nominal terms of the form TCT, so that a structure of form BS3 becomes TCT → (TCT)D. For example, "mother and father" becomes "(mother and father) drive." In this way, multiple nominal terms can be combined, either conjunctively or alternatively, for purposes of modification. It should also be noted that verbs having transitive senses, such as "drive," are included in the database as connectors as well as descriptors. Another example is the verb "capsize," which can be intransitive ("boat capsize") as well as transitive ("captain capsize boat").
Rule IIa: To a nominal term, add a connector and another nominal term (T → TCT).
In accordance with Rule IIa, any linguistic unit from the nominal class can be replaced with a connector surrounded by two nominal entries, one of which is the original linguistic unit. For example, "house" becomes "house on the hill." Applying Expansion Rule IIa to BS1 produces TD → (TCT)D; for example, "house gloomy" becomes "(house on the hill) gloomy," or "the house on the hill is gloomy." Rule IIa can be used to add a transitive verb and its object. For example, the term "mother and father" can be expanded to "(mother and father) travel in the car."
Rule IIb: To a nominal term, add a logical connector and another nominal term (T → TCT).
In accordance with Rule IIb, any linguistic unit from the nominal class can be replaced with a logical connector surrounded by two nominal entries, one of which is the original linguistic unit. For example, "dog" becomes "dog and cat."
Again, for purposes of Rule IIa and Rule IIb, a nominal term can be a composite consisting of two or more nominal terms joined by a connector. For example, the expansion "(John and Bill) go to the market" satisfies Rule IIa. Subsequently applying Rule I, this sentence can be further expanded to "((John and Bill) go to the market) together."
Rule III: To a descriptor, add a logical connector and another descriptor (D → DCD).
In accordance with Rule III, a descriptor can be replaced with a logical connector surrounded by two descriptors, one of which is the original. For example, "big" becomes "big and brown." Applying Expansion Rule III to BS1 produces TD → T(DCD); for example, "dog big" (equivalent to "the dog is big" or "the big dog") becomes "dog (big and brown)" (equivalent to "the dog is big and brown" or "the big brown dog").
The manner in which these rules are applied to form acceptable sentences in accordance with the invention is illustrated in Figure 1.
Beginning with a nominal term such as cat, shown at 110, any of the three basic structures can be formed by following Expansion Rules I, IIa and IIb as shown at 112, 114 and 116, respectively, to produce "cat striped" (BS1), "cat on sofa" (BS2) or "cat and Sue" (BS3). Iterative application of Expansion Rule IIa at 118 and 119 produces structures of the form TC1T1 → (TC1T1)C2T2, or "((cat on sofa) eats mouse)," and (TC1T1)C2T2 → ((TC1T1)C2T2)C3T3, or "(((cat on sofa) eats mouse) with tail)." Expansion Rule I can be applied at any point to a T linguistic unit, as shown at 122 (to modify the original T, cat, to produce "(cat happy) on sofa") and 124 (to modify "eats mouse"). Rule III can also be applied, as shown at 126 (to further modify cat by joining a second descriptor to "happy" with the logical connector "and") and 128 (to further modify "eats mouse").
Expansion Rule I can be applied iteratively, as shown at 112 and 130, to further modify the original T (although, as emphasized at 130, a descriptor need not be an adjective). Expansion Rule IIa is available to show action of the modified T (as shown at 132), and Rule I can be used to modify the newly introduced T (as shown at 134). Rule I can also be used to modify (in the broad sense of the invention) a compound subject formed by Rule IIb, as shown at 136.
The order in which linguistic units are assembled can strongly affect meaning. For example, the expansion TC1T1 → (TC1T1)C2T2 can take multiple forms. The construction "cat hits (ball on sofa)" conveys a meaning different from "(cat hits ball) on sofa." In the former the ball is definitely on the sofa, and in the latter the action takes place on the sofa. The sentence "(John wants car) fast" indicates that the action should be accomplished quickly, while "(John wants (car fast))" means that the car should move quickly.
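The expansion rules and the effect of grouping order can be sketched as small tree-building functions. The Python below is a minimal illustration under stated assumptions, not the implementation described in the specification: the function names, the nested-tuple representation, and the tags "TLT" and "DLD" (where L is a label introduced here for a logical connector, which the text also writes as C) are all hypothetical. For simplicity the original term is always placed first, although the rules only require that it be one of the two terms.

```python
LOGICAL = ("and", "or")  # the logical connectors named in the text


def rule_I(t, d):
    """Rule I: T -> TD. Append a descriptor to a nominal term."""
    return ("TD", t, d)


def rule_IIa(t, c, t2):
    """Rule IIa: T -> TCT. Join the original term to another via a connector."""
    return ("TCT", t, c, t2)


def rule_IIb(t, logical, t2):
    """Rule IIb: T -> TCT, where the connector is a logical connector."""
    if logical not in LOGICAL:
        raise ValueError("not a logical connector: " + logical)
    return ("TLT", t, logical, t2)


def rule_III(d, logical, d2):
    """Rule III: D -> DCD. Join two descriptors with a logical connector."""
    if logical not in LOGICAL:
        raise ValueError("not a logical connector: " + logical)
    return ("DLD", d, logical, d2)


def render(node, top=True):
    """Write a tree in algebraic form, parenthesizing every nested group."""
    if isinstance(node, str):
        return node
    text = " ".join(render(child, top=False) for child in node[1:])
    return text if top else "(" + text + ")"


# Iterating Rule IIa, following the Figure 1 walk-through.
s = rule_IIa("cat", "on", "sofa")
s = rule_IIa(s, "eats", "mouse")
s = rule_IIa(s, "with", "tail")
print(render(s))  # ((cat on sofa) eats mouse) with tail

# The same units grouped two ways give two different trees.
ball_on_sofa = rule_IIa("cat", "hits", rule_IIa("ball", "on", "sofa"))
action_on_sofa = rule_IIa(rule_IIa("cat", "hits", "ball"), "on", "sofa")
print(render(ball_on_sofa))    # cat hits (ball on sofa)
print(render(action_on_sofa))  # (cat hits ball) on sofa
```

Keeping the grouping in the data structure, rather than in the flat word order, is what preserves the distinction between the ball being on the sofa and the action taking place there.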
A more elaborate example of the foregoing expansion rules, illustrating the utility of the invention in representing a natural-language discussion, appears in Appendix 4.
A representative hardware implementation of the invention is shown in Figure 2. As indicated therein, the system includes a bidirectional system bus 200, over which all system components communicate. The main sequence of instructions embodying the invention, as well as the databases discussed below, reside on a mass storage device (such as a hard disk or optical storage unit) 202 as well as in a main system memory 204 during operation. Execution of these instructions and effectuation of the functions of the invention are accomplished by a central processing unit ("CPU") 206.
The user interacts with the system using a keyboard 210 and a position-sensing device (e.g., a mouse) 212. The output of either device can be used to designate information or select particular areas of a display screen 214 to direct functions to be performed by the system.
The main memory 204 contains a group of modules that control the operation of CPU 206 and its interaction with the other hardware components. An operating system 220 directs the execution of low-level, basic system functions such as memory allocation, file management and operation of mass storage devices 202. At a higher level, an analysis module 225, implemented as a series of stored instructions, directs execution of the primary functions performed by the invention, as discussed below; and instructions defining a user interface 230 allow straightforward interaction over display screen 214. The interface 230 generates words or graphical images on display 214 to prompt action by the user, and accepts user commands from keyboard 210 and/or position-sensing device 212.
The main memory 204 also includes a partition defining a series of databases capable of storing the linguistic units of the invention, representatively denoted by reference numerals 235₁, 235₂, 235₃, 235₄. These databases 235, which may be physically distinct (i.e., stored in different memory partitions and as separate files on storage device 202) or logically distinct (i.e., stored in a single memory partition as a structured list that may be addressed as a plurality of databases), each contain all of the linguistic units corresponding to a particular class in at least two languages. In other words, each database is organized as a table, each of whose columns lists all of the linguistic units of the particular class in a single language, so that each row contains the same linguistic unit expressed in the different languages the system is capable of translating. In the illustrated implementation, nominal terms are contained in database 235₁, and a representative example of the contents of that database in a single language (English), that is, the contents of one column in what would be a multi-column working database, appears in Appendix 1; connectors are contained in database 235₂, an exemplary column of which appears in Appendix 2; descriptors are contained in database 235₃, an exemplary column of which appears in Appendix 3; and logical connectors (most simply, "and" and "or") are contained in database 235₄.
An input buffer 240 receives from the user, via keyboard 210, an input sentence that is preferably structured in accordance with the invention and formatted as described below. In this case, the analysis module 225 initially examines the input sentence for conformance to the structure.
Following this, module 225 processes single linguistic units of the input sentence in an iterative fashion, addressing the databases to locate the entries corresponding to each linguistic unit in the given language, as well as the corresponding entries in the target language. The analysis module 225 translates the sentence by replacing the input entries with the entries from the target language, entering the translation into an output buffer 245 whose contents appear on the display screen 214.
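The table organization and the substitution step can be sketched as follows. This Python example is a rough illustration only: the `DATABASES` contents, the language codes, the class label "L" for logical connectors, and the function names `find_row` and `translate` are all assumptions made for the example, and the handful of entries shown is not taken from the appendices. Unrecognized units are returned in square brackets here; the choice of square brackets is likewise an assumption.

```python
# One hypothetical table per form class: each row is one linguistic unit,
# each column (dictionary key) is one language.
DATABASES = {
    "T": [{"en": "dog", "es": "perro", "fr": "chien"},
          {"en": "food", "es": "comida", "fr": "nourriture"}],
    "C": [{"en": "eats", "es": "come", "fr": "mange"}],
    "D": [{"en": "big", "es": "grande", "fr": "grand"}],
    "L": [{"en": "and", "es": "y", "fr": "et"}],
}


def find_row(unit, source):
    """Return the first row whose source-language column holds the unit."""
    for rows in DATABASES.values():
        for row in rows:
            if row.get(source) == unit:
                return row
    return None


def translate(units, source, targets):
    """Replace each unit with the entry from the same row in each target column.

    One pass over the input yields every target language at once, because
    all translations of a unit sit in the same row.
    """
    out = {lang: [] for lang in targets}
    for unit in units:
        row = find_row(unit, source)
        for lang in targets:
            out[lang].append(row[lang] if row else "[" + unit + "]")
    return {lang: " ".join(words) for lang, words in out.items()}


print(translate(["dog", "big", "eats", "food"], "en", ["es", "fr"]))
print(translate(["Bill", "eats", "food"], "en", ["es"]))
```

The sketch ignores grouping marks and returns the first matching row, so a word stored in more than one class (such as a verb listed as both connector and descriptor) would need its class resolved before lookup.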
It should be understood that although the modules of main memory 204 have been described separately, this is for clarity of presentation only; so long as the system performs all necessary functions, it is immaterial how they are distributed within the system and its programming architecture.
In order to facilitate convenient analysis by module 225, input sentences are preferably structured in a characteristic, easily processed format that facilitates both straightforward identification of individual linguistic units and simple verification that the sequence of units qualifies as a legitimate sentence in accordance with the expansion rules of the invention. In one approach ("portrait form"), each linguistic unit of a sentence appears on a separate line. If an expansion has been applied, an asterisk (*) is used to mark where the expansion occurred; that is, the * is used to connect basic sentence structures together to form larger sentences. For example, drawing from the entries in Figure 1,
cat striped *hits* ball red
represents the results of steps 132 and 134.
Alternatively, the sentence can be expressed in an algebraic ("landscape") format, where expansions are identified by enclosing the expansion terms in parentheses:
(cat striped) hits (ball red)
In either case, the user's input is treated as a character string, and using standard string-analysis routines, module 225 identifies the separate linguistic units and the expansion points. It then compares these with templates corresponding to the allowed expansion rules to validate the sentence, following which database lookup and translation take place. If the sentence fails to conform to the rules of the invention, module 225 alerts the user via display screen 214.
In accordance with either of these representation formats, plurals in English are noted by adding "/s" to the end of a singular noun (e.g., "nation/s").
In other languages, the most generic method of forming plurals is used; for example, in French, "/s" is added as in English, but in Italian, "/i" is added. Numbers are expressed numerically.
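One way the string analysis and template comparison for the algebraic format might look is sketched below. This is a hedged illustration rather than the actual analysis module: the toy `LEXICON`, the class label "L" for logical connectors, the `TEMPLATES` table, and the function names are assumptions, every linguistic unit is assumed to be a single word, and every expansion is assumed to be explicitly parenthesized (so a flat series such as "Bob and Ted and Al" would have to be grouped first).

```python
import re

# Hypothetical toy lexicon mapping single-word units to form classes.
LEXICON = {"cat": "T", "ball": "T", "sofa": "T", "nation": "T",
           "striped": "D", "red": "D", "happy": "D",
           "hits": "C", "on": "C", "and": "L", "or": "L"}

# Class patterns allowed inside one group, and the class the group yields.
TEMPLATES = {("T", "D"): "T", ("T", "C", "T"): "T",
             ("T", "L", "T"): "T", ("D", "L", "D"): "D"}


def parse(text):
    """Turn '(cat striped) hits (ball red)' into nested lists of units."""
    stack = [[]]
    for tok in re.findall(r"[()]|[^\s()]+", text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            group = stack.pop()
            stack[-1].append(group)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def form_class(node):
    """Return the form class of a unit or group, or raise ValueError."""
    if isinstance(node, str):
        base = node.split("/")[0]  # drop a plural marker such as "/s"
        if base not in LEXICON:
            raise ValueError("unknown unit: " + node)
        return LEXICON[base]
    if len(node) == 1 and isinstance(node[0], list):
        return form_class(node[0])  # redundant outer parentheses
    classes = tuple(form_class(child) for child in node)
    if classes not in TEMPLATES:
        raise ValueError("no expansion rule yields " + "".join(classes))
    return TEMPLATES[classes]


def is_valid(text):
    """True if the whole input reduces to a nominal structure (T)."""
    try:
        return form_class(parse(text)) == "T"
    except ValueError:
        return False


print(is_valid("(cat striped) hits (ball red)"))  # True
print(is_valid("hits cat"))                       # False
```

Reducing each parenthesized group to a single class symbol before checking its parent keeps the template table small: four patterns cover the three basic structures and all four expansion rules.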
Alternatively, the analysis module 225 can be configured to process unformatted input sentences. To accomplish this, module 225 looks up each input word (or, as appropriate, group of words) in databases 235 and builds a representation of the sentence in terms of the linguistic classes comprising it, that is, replacing each unit with its linguistic class symbol. Module 225 then assesses whether the resulting sequence of classes could have been generated in accordance with the allowed expansion rules and, if so, groups the linguistic units to facilitate lookup and translation. The output is provided either in an unstructured format corresponding to the input or in one of the formats set forth above. The latter form of output is preferred, since word strings in one language rarely correspond sensibly to word strings in another language produced solely by substitution; it is generally easier to comprehend output in a form that isolates the linguistic units and highlights the expansions.
The invention may incorporate additional features to simplify operation. For example, as noted above, words having multiple senses are differentiated by ending periods; naturally, the number of periods following a particular sense of the word represents an arbitrary choice. Accordingly, an additional database 235 can comprise a dictionary of words having multiple meanings, with the invention-recognized format of each sense of the word set next to the various definitions. The user interface 230 interprets the user's clicking on one of the definitions as selection thereof, and enters the proper encoding of the word into input buffer 240.
Similarly, because considerations of economy and speed of operation limit the overall desirable size of the databases, one of the databases 235 can be configured as a thesaurus that gives the closest invention-recognized linguistic unit to an unrecognized input word.
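The unformatted-input path and the thesaurus fallback can be sketched together. Everything named in this Python example is an assumption made for illustration: the toy `LEXICON` and `THESAURUS`, the label "L" for logical connectors, and the functions `suggest` and `analyze`. The regular expression is this sketch's own reading of the expansion rules with grouping removed (each nominal term may be followed by descriptors, descriptors may be chained by logical connectors, and nominal terms are chained by connectors); it is not stated in the specification, and it assumes each word belongs to exactly one class.

```python
import re

# Hypothetical single-class lexicon of recognized linguistic units.
LEXICON = {"dog": "T", "cat": "T", "food": "T",
           "big": "D", "brown": "D", "eats": "C", "and": "L", "or": "L"}

# Hypothetical thesaurus: unrecognized word -> related words.
THESAURUS = {"hound": ["dog", "canine"], "large": ["big", "huge"]}

# Flat class sequences derivable from T -> TD | TCT | TLT and D -> DLD.
PATTERN = re.compile(r"T(D(LD)*)*([CL]T(D(LD)*)*)*")


def suggest(word):
    """Return thesaurus candidates that actually appear in the lexicon."""
    return [w for w in THESAURUS.get(word, []) if w in LEXICON]


def analyze(text):
    """Return (class sequence, derivable?, suggestions for unknown words)."""
    words = text.split()
    unknown = {w: suggest(w) for w in words if w not in LEXICON}
    if unknown:
        return None, False, unknown
    seq = "".join(LEXICON[w] for w in words)
    ok = len(seq) > 1 and PATTERN.fullmatch(seq) is not None
    return seq, ok, {}


print(analyze("dog big eats food"))  # ('TDCT', True, {})
print(analyze("hound eats food"))    # (None, False, {'hound': ['dog']})
```

A match only shows that some grouping exists; choosing among several possible groupings, which the text notes can change the meaning, is left open by this sketch.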
In operation, when module 225 unsuccessfully attempts to locate a word in the databases, it can be programmed to consult the thesaurus database 235 and return a list of words that do, in fact, appear in the linguistic-unit databases.
Module 225 can also include certain utilities that recognize and correct (e.g., after approval by the user) frequently made errors in sentence construction. For example, the present invention ordinarily indicates possession by a named person using the verb "have"; thus, the sentence "Paul's computer is fast" is represented (in algebraic format) as "Paul has (computer fast)" or "(computer of Paul) fast"; if the person is unnamed, the usual possessive pronouns can be used (e.g., "(computer my) fast").
Thus, module 225 can be configured to recognize constructions such as "Paul's" and return the appropriate construction in accordance with the invention.
It will therefore be seen that the foregoing represents a convenient and fast approach to translation among multiple languages. The terms and expressions employed herein are used as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. For example, the various modules of the invention can be implemented on a general-purpose computer using appropriate software instructions, or as hardware circuits, or as mixed hardware-software combinations.
APPENDIX 1
actor entertainer address address advertisement advertisement advice africa africa afternoon afternoon age age ai goal air air airplane plane airport algeria algeria altitude altitude aluminum aluminum ambassador ambassador amount amount animal animal ankle ankle ans answer ant ant apartment apartment appetite appetite apple apple appointment appointment apricot april april architect architect argentina argentina argument argument arm arm army army arrival arrival art artist art artist asia attic attic august august aunt aunt australia australia austria austria author author authority authority avalanche avalanche baby baby back backpack backpack bag bag bag baker baker balcony balcony ball banana banana bandage band bank bench barley barley barn barn barrel barrel basket basket bath bathrobe bathrobe bath tub bathtub battery battery beach beach bean bean bear bear beard beard bed bed bedroom alcove bee bee beef meat beer beer beet beet beginning behavior behavior belgium Belgium bell bell beit belt benefit profit beverage drink bicycle bicycle
bill billiard billiard bird bird birth birth birthday birthday bladder blister blanket blanket blood blood blouse blouse boat body body Bolivia bolivia bomb bomb bone bone book book border border bottle bottle bottom bottom bowl bowl box boy boy bracelet bracelet brain brain brake brake brass brass brazil Brazil bread bread breakfast breakfast breath breath brick brick bridge bridge broom broom brother brother brush brush construction bulgaria Bulgaria bullet bus bus butcher butcher butter butter butterfly butterfly button button cabbage cab cabin cabana coffee cafe cake camel camel camera camera camp camp glen Canada canal channel candle candle cane cane capital captain captain captain car automobile cardboard cardboard charge cargo carpenter carpet carpenter carpet carrot carrot cash money cat cat cattle cattle cauliflower cauliflower cellar cellar cemetery cemetery chain string chair chair cheek cheek cheese chemistry chemistry chemistry cherry cherry chess chess chest chest chest chicken chicken child chili chili chin chin china china chocolate chocolate christmas christmas church church cigar cigar cigar circle circle citizen citizen clock clothing clothing cloud cloud clove clove club club coal jacket coat jacket cockroach cockroach cocoa cocoa coffee coffee collar collar colombia colombia color color comb comb comfort consolation competition competition computer computer concert concert condition condition connection connection conversation conversation cook copper copper copy copy corkscrew corkscrew corn corn cost cost cotton cotton couch bed country country courage value cousin cousin cow cracker cookie crane crane cream cream crib crib crime crime cuba cuba cucumber cucumber cup cup curtain curtain czechoslovakia Czechoslovakia damage harm dance dance danger danger date daughter daughter day day death death debt debt december december decision 
decision degree denmark denmark dentist dentist departure departure desert desert dessert dessert diarrhea diarrhea dictionary dictionary digestion digestion dining room dinner dinner direction direction disease disease dish dish distance document document dog dog donkey ass door door drawing drawing dream dream dress driver driver drum drum duck duck dust dust eagle eagle ear ear earring earring earthquake earthquake ecuador Ecuador education education eel eel egg egypt egypt elbow elbow electricity electricity elevator elevator end end enemy enemy energy energy engine engine engineer england england entrance entrance envelope ethiopia ethiopia europe europe excuse excuse exhibition exhibition exit exit expense expense export export eye eye face face factory factory fall fall family family farm farm father father february february ferry ferry fig fig finger finger fingernail finger finland finland fire fire fish fishing fist fist flea flea flood flood floor floor flour flour flower flower flute flute fly fly food food foot foot football football forest forest fork fork fox fox france france friday friday friday friend frog frog front front fruit fruit funeral funeral game game garden garden garlic garlic gasoline gas gauge measure germany germany gift gift girl girl glass glass glasses glasses glove glove glue glue goat goat god gold gold goose goose government government grape grape grapefruit grapefruit grass lawn greece greece group guard guard guest guide guide gun weapon gymnastics gym hail hail hair hairdresser hairdresser half half hammer hammer hand hand handkerchief kerchief harbour harbor harvest harvest hat hat he he head head health health heart heart heel heel here here highway highway hole hole holiday holland holland honey honey horse horse horse race race cottage hospital hospital hotel hotel hour hour house house hungary Hungary husband 
husband I I ice ice ice-cream iceland iceland Iceland idea idea import import india India indonesia Indonesia information information ink ink insect insurance insurance interpreter interpreter invention invention Iran Iran Iraq ireland Ireland iron iron island israel Israel Israel it italy Italy january January japan Japan jewel jewel job job joke joke Jordan Jordan juice juice july july june june kenya Kenya key key kidney kidney kind king king kitchen kitchen knee knee knife knife kuwait knife Kuwait lace cord ladder ladder lake lake lamb lamb language language lawyer lawyer lead lead leaf leaf leather leather lebanon Lebanon leg leg lemon lemon letter letter liberia Liberia library library libya libya license license life life light light light-bulb focus lightning lightning lime lime linen linen lion lion lip liquid liquid liver liver living-room living room lobster lobster lock lock look look loom loom love love luck luck luggage luggage lunch lunch lung lung machine machine magazine magazine magic magic maid maid mail mail malaysia malaysia malta Malta man man map map march march market market marriage marriage match match mattress mattress may power meat meat medicine meeting meeting melon melon member member memorial metal metal mexico Mexico middle middle milk milk minute minute mistake error monday monday money money monkey monkey month month moon moon morning morning morocco morocco mosquito mosquito mother mother mountain mountain mouse mouse mouth mouth movie movie mushroom fungus mustard mustard nail nail-file nail file name nature nature neck neck necklace necklace needle neighbor neighbor nepal nepal netherlands netherlands new zealand new zealand newspaper newspaper nicaragua nicaragua nigeria nigeria night night noodle noodle noon noon north america north north pole north pole norway Norway nose nose november november number number nurse 
nurse nut walnut oak oak oar oats oats october october october office oil olive olive onion onion orange orange ore ore ox ox pack package pain pain painting painting pair pair pakistan Pakistan pancake pancake panic panic pants pants paper paper parachute parachute parents parents parking parked part part partridge passport passport passport pea pea peace peace pear pear peasant farmer pen pen pencil pencil people people pepper pepper persia persia peru peru pharmacy pharmacy philippines philippines philippines doctor physician piano piano picture picture pig pig pigeon pigeon pillow pillow pilot pilot pin pin pine-tree pine pipe pipe plant plant platform play play playing card playing card pleasure pleasure plum plum pocket pocket poison poison poland Poland police-officer Officer of porter porter Portugal Portugal post-office office postcard postcard postcard pot pot potato papa powder dust prison problem problem property property purse bag quarter queen queen question question rabbit rabbit radio radio rag rag rain rain raincoat waterproof rat rat razor razor receipt receipt record-player turntable refrigerator refrigerator religion religion rent restaurant restaurant result rice ring risk river rocket ring roof rocket rubber rocket ring rocket rubber rope rocket rubber river rocket rubber rope russia russia Russia rust rust saddle saddle sadness sadness safety safety safety-belt safety belt sailor sailor salt salt sand sand saturday saturday saucer sauce saudi-arabia sausage sausage scale scale scarf chew school school science science scissors scissors scotland scotland screw screw sea sea same september september shape shape she she sheep sheep shirt shirt shoe shoe shoulder side side signature signature silk silk silver silver sister sister situation situation size skin skin skis skies sky sky sled scent smell smoke smoke snake snake snow snow 
snow soap socks socks socks soldier soldier refreshment solution solution son son song sound sound soup soup south-africa south africa south-america south america south-pole south pole soviet-union soviet union space space spain spain spice spoon spoon spring spring staircase stair stamp stamp star star starch starch station station steak steak steel steel stick stock-market stock market stomach stomach stone store store storm storm story history stove stove street street student subway meter sugar sugar summer summer sun sun sunday sunday surprise surprise swamp swamp sweden sweden switzerland switzerland syria syria table table tail tail tailor tailor taste flavor tax tax tea tea teacher teacher telephone phone television television tent store test test thailand thailand theater theater they they thief thief thigh thigh thing thing thirst thirst thread thread throat throat thumb thumb thunder thunder thursday thursday ticket ticket tie tie tiger tiger time time itinerary tin tin tire tire toast tobacco tobacco today today toe toilet toilet tomato tomato tomorrow morning tongue tool tool tooth tooth toothbrush toothbrush top top towel towel town town toy toy train tree tree trip trip trouble problem truth truth tuesday tuesday tunisia turkey turkey tv-show tv program typewriter typewriter umbrella umbrella uncle uncle united-states United States of America Uruguay Uruguay us we vaccination vaccination vegetable vegetable velvet velvet Venezuela Venezuela victim victim view view village town vinegar vineyard violin voice voice waiter waiter wall wall war war waste loss watch watch water water we we weather weather wedding wedding wednesday wednesday week weight weight wheat wheat where? where? who? who? 
wife wife wind wind window winter winter woman woman wood wood wool word word work work year year yesterday yesterday you you yugoslavia Yugoslavia
APPENDIX 2 CONNECTORS
able-to capable of about on above previously across by scared-of after after against against allow allow answer answer arrest arrive-at arrive-at ask ask at a bake bake be sea because it becomes become before before of begin begin behind behind believe believe bet betray betray betray between blame blame bother bother break break bring bring burn burn but buy buy call call called capsize overturn capture capture carry carry catch catch cause change change climb rise cyst fence cook cook count account cut cut deal-with deal decrease decrease defeat defeat deliver deliver discuss discuss down drink drink drive drop drop eat eat examine examine explain explain find finish finish fix trouble for for forget forget forget from from fry fry give go-in go-in go-through go-through go-to go hang hang hate hate have hear hear help help hit hit hunt hunt if yes in in-front-of in-front-of order-to in-order-to include include increase increase kill death kiss kiss know know learn learn leave leave like as live-in resident look-for seem-to made-of-make-make meet meeting mix mix more-than-more-move move near almost need need occupy occupation of on on outside off pay pay play groom groom print print promise promise prove demonstrate pull pull push e punch shoot put read lea reduce reduce refuse scrap remember remember repeat ride ride roast roast say tell see see sell send send sew shave shave shoot shoot should sing sing sing smell smell speak steal steal sting sting stop study study take take teach teach throw throw to touch touch translate translate try turn-off turn-on turn under understand understand until use use value value visit visit want 
need wash wash while win win with with work-for work-for write write
APPENDIX 3 DESCRIPTORS
abroad abroad absent away again agree again agree alive live all all almost almost exclusively also also always always angry an another another any argue defend artificial artificial automatic available available backward back bad bad bashful embarrassing beautiful begin begin black black blind blind blond blond blue blue boil boil boring boring born born brave brave broken broken brown chestnut burn burn capsize capsize careful careful change change cheap clean clean clear clear cold cold complain complain continue continue correct correct cough crazy crazy cry cry curious curious damp damp dangerous dangerous dark dark dead dead deaf decrease decrease decrease deep deep defective defective different different difficult difficult dirty dirty drop drown drop drown dry dry early early east east easy easy empty enough enough expensive expensive expire expire extreme extreme far away fast fast fat fat few few first first flat floor fly fly forbidden forbidden foreign foreigner fragile fragile free free fresh cool fun fun comical glad glad good good goodbye goodbye green green gray gray grow grow guilty guilty hang calda happen pass happy happy hard hard healthy healthy heavy strong hungry hungry illegal illegal important important increase increase intelligent intelligent interesting interesting jealous jealous kiss kiss large big last last late laugh laugh lazy lazy left left legal legal long long malignant malignant maybe maybe mean bad more more much much mute mute mutual mutual my my nervous nervous neutral neutral never never new again next soon nice north north not now now often often okay ok old old open our our permitted allowed pink play play please please poor poor portable portable possible possible 
previous before quiet quiet red red rest rest rich rich right right ripe mature round run run sad sad safe safe short short sick sick similar similar sit sit sleep sleep slow slow slowly slowly small smile smile smile soft soft some some sometimes sometimes sour sour south south special special stand position strong very well sweet sweetly swim swim talk talk tall high thanks thanks there there thick thin thin slim think think tired tired together together too-much too transparent transparent travel trip ugly ugly upstairs up urgent urgent wait wait walk walk warm hot weak weak west west wet wet white white why? why? worry care wrong get lost yellow yellow young young your your
APPENDIX 4
Health officials in Zaire said that 97 people had died of the Ebola virus so far. Jean Tamfun, a virologist who helped identify the virus in 1976, criticized the government's quarantines and roadblocks as ineffective. On Saturday, the quarantine on the Kikwit region was officially lifted.
officer/s of health of Zaire said * that 97 people * death * due-to * virus of name Ebola
Jean-Tamfun is a * virologist in Zaire
he helped * scientist/s to identify * virus named Ebola * in 1976
Jean-Tamfun criticized * government of Zaire
he said * inefficient quarantine/s * and * inefficient road block/s
government ends * quarantine of * region named Kikwit * on Saturday

Claims (20)

1. A method of translating information from a first language into a second language, the method characterized in that it comprises:
   a. providing reference and target sets of nominal, connector, descriptor and logical-connector databases in reference and target languages, each nominal database comprising a series of nominal entries, each connector database comprising a series of connector entries each specifying a relationship between at least two nominal entries, each descriptor database comprising a series of descriptor entries describing nominal entries, and each logical-connector database comprising a series of entries establishing sets, the entries of the reference set of databases corresponding to the entries of the target set of databases;
   b. generating a sentence in the reference language comprising a plurality of entries from the reference set of databases;
   c. addressing the target set of databases with the reference entries to retrieve the target entries corresponding thereto; and
   d. translating the sentence by substituting the target entries for the reference entries.
2. The method according to claim 1, characterized in that the sentence is generated by selecting an entry from the nominal database and expanding the sentence by applying at least one of the following rules:
   a. to a nominal entry, add a descriptor entry from the descriptor database;
   b. to a nominal entry, add a connector entry from the connector database and another nominal entry from the nominal database;
   c. to a nominal entry, add a logical-connector entry from the logical-connector database and another nominal entry from the nominal database; and
   d. to a descriptor entry, add a logical-connector entry from the logical-connector database and another descriptor entry from the descriptor database.
3. The method according to claim 1, characterized in that the nominal entries name a person, place, thing, activity or idea.
4. The method according to claim 3, characterized in that the nominal entries include the terms set forth in Appendix 1.
5. The method according to claim 1, characterized in that the connector entries show action, being or state of being.
6. The method according to claim 5, characterized in that the connector entries include the terms set forth in Appendix 2.
7. The method according to claim 1, characterized in that the descriptor entries describe a quality, quantity, state or type of a nominal entry.
8. The method according to claim 7, characterized in that the descriptor entries include the entries set forth in Appendix 3.
9. The method according to claim 1, characterized in that the logical-connector database comprises the entries "and" and "or".
10. Apparatus for translating information from a first language into a second language, the apparatus characterized in that it comprises:
   a. first database means comprising a series of nominal entries in a reference language and at least one target language;
   b. second database means comprising a series of connector entries in a reference language and at least one target language, the connector entries each specifying a relationship between at least two nominal entries;
   c. third database means comprising a series of descriptor entries in a reference language and at least one target language, the descriptor entries describing nominal entries;
   d. fourth database means comprising a series of entries establishing sets;
   e. means for accepting an input in the reference language, the input comprising entries from the database means; and
   f. analysis means for (i) addressing the target set of databases with the input to retrieve the target entries corresponding thereto and (ii) translating the sentence by substituting the target entries for the input.
11. The apparatus according to claim 10, characterized in that the analysis means is configured to ensure that the input conforms to a sentence constructed in accordance with expansion rules comprising:
   a. to a nominal entry, add a descriptor entry from the third database means;
   b. to a nominal entry, add a connector entry from the second database means and another nominal entry from the first database means;
   c. to a nominal entry, add an entry from the fourth database means and another nominal entry from the first database means; and
   d. to a descriptor entry, add an entry from the fourth database means and another descriptor entry from the third database means.
12. The apparatus according to claim 10, characterized in that the nominal entries name a person, place, thing, activity or idea.
13. The apparatus according to claim 12, characterized in that the first database means includes the terms set forth in Appendix 1.
14. The apparatus according to claim 10, characterized in that the connector entries show action, being or state of being.
15. The apparatus according to claim 14, characterized in that the second database means includes the terms set forth in Appendix 2.
16. The apparatus according to claim 10, characterized in that the descriptor entries describe a quality, quantity, state or type of a nominal entry.
17. The apparatus according to claim 16, characterized in that the third database means includes the terms set forth in Appendix 3.
18. The apparatus according to claim 10, characterized in that the fourth database means comprises the entries "and" and "or".
19. The apparatus according to claim 10, characterized in that it further comprises a thesaurus for identifying, for an input term that does not correspond to any of the database entries, a database entry closest in meaning to the input term.
20. The apparatus according to claim 10, characterized in that it further comprises means for displaying the translation.
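The four expansion rules and the direct-lookup translation recited in the claims can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the tiny English-to-Spanish databases, the helper names `entry`, `expand` and `translate`, and the choice to place added entries immediately after the entry being expanded (the claims do not fix a word order). It is not the patent's implementation.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (form class, reference-language term)

# Hypothetical miniature databases: reference term -> target term.
# One dictionary per form class, mirroring the four databases of claim 1.
DATABASES: Dict[str, Dict[str, str]] = {
    "nominal": {"dog": "perro", "cat": "gato", "house": "casa"},
    "connector": {"see": "ver", "in": "en"},
    "descriptor": {"black": "negro", "large": "grande"},
    "logical": {"and": "y", "or": "o"},
}

# Expansion rules a-d: (class being expanded, classes of the added entries).
RULES = {
    ("nominal", ("descriptor",)),               # rule a
    ("nominal", ("connector", "nominal")),      # rule b
    ("nominal", ("logical", "nominal")),        # rule c
    ("descriptor", ("logical", "descriptor")),  # rule d
}


def entry(form_class: str, term: str) -> Token:
    """Return a token, checking that the term exists in the named database."""
    if term not in DATABASES[form_class]:
        raise KeyError(f"{term!r} is not in the {form_class} database")
    return (form_class, term)


def expand(sentence: List[Token], index: int, *added: Token) -> List[Token]:
    """Apply one rule at sentence[index]; added tokens go right after it (assumed order)."""
    head_class = sentence[index][0]
    added_classes = tuple(form_class for form_class, _ in added)
    if (head_class, added_classes) not in RULES:
        raise ValueError(f"no rule adds {added_classes} to a {head_class} entry")
    return sentence[: index + 1] + list(added) + sentence[index + 1 :]


def translate(sentence: List[Token]) -> List[str]:
    """Substitute each reference entry with the corresponding target entry."""
    return [DATABASES[form_class][term] for form_class, term in sentence]


# Build "dog see cat black and large" one rule at a time, then translate it.
sentence = [entry("nominal", "dog")]
sentence = expand(sentence, 0, entry("connector", "see"), entry("nominal", "cat"))    # rule b
sentence = expand(sentence, 2, entry("descriptor", "black"))                          # rule a
sentence = expand(sentence, 3, entry("logical", "and"), entry("descriptor", "large")) # rule d
translated = translate(sentence)
print(" ".join(translated))
```

Because every sentence is assembled only from database entries, translation reduces to one dictionary lookup per entry; supporting a further target language would, under the same assumptions, only require another set of dictionaries keyed by the same reference terms.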
MXPA/A/1999/003732A 1996-10-31 1999-04-22 Method and apparatus for automated language translation MXPA99003732A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08740654 1996-10-31

Publications (1)

Publication Number Publication Date
MXPA99003732A (en) 2000-04-24
