GB2096374A - Translating devices - Google Patents
- Publication number
- GB2096374A
- Authority
- GB
- United Kingdom
- Prior art keywords
- language
- source
- module
- target
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/55—Rule-based translation
Abstract
A translating device translates one natural human language into another via an intermediate language, to reduce the number of dictionaries required when there are a number of possible source and target languages.
Description
SPECIFICATION
Translating devices
This invention concerns translating devices, and relates in particular to apparatus for the automatic translation of one language into another.
With the advent of the microprocessor, coupled with the ability to store large amounts of data in a relatively small space, it has become possible to build small, fast, computer-type machines capable of translating information (such as speech or text) in one human language into another. There are presently commercially available a number of such machines; the larger are capable of full and fairly idiomatic translation of many thousands of words and expressions both on their own and in the form of whole sentences, while the smaller ones, though usually having a rather limited ability, are small enough to be held in the hand.
Though doubtless their specific modes of operation are different, these machines seem to have their general principles in common. Briefly, each machine contains in some form or other a dictionary for the two languages (for two-way translation of, say, French and German the machine needs both a French-into-German and a German-into-French dictionary), and two sets of rules relating to the syntax of the two languages concerned. In operation the machine accepts as input a word or string of words in one language (the source language), uses the syntax rules for that language to decide what sort of word it is/they are, which enables it to use the dictionary properly to ascertain the correct translation into the second language (the target language), and finally uses the syntax rules for the target language to construct a word or string of words output in that language.
This type of system suffers from at least one serious shortcoming: there have to be two dictionaries for every possible source/target pair of languages. Thus: for two languages (French and German, say) there have to be two dictionaries (French-to-German and German-to-French); for three languages (French, German and Spanish, say) there have to be six dictionaries (French-to-German, German-to-French; French-to-Spanish, Spanish-to-French; German-to-Spanish, Spanish-to-German); for four languages there have to be twelve dictionaries; and ten languages need ninety dictionaries. Indeed, for any number N of languages there are $N!/(N-2)! = N^2 - N$ ordered pairs of languages, requiring the same number of dictionaries. As is clearly evident, the cost and complexity of a system adapted to translate any two of even as few as ten languages is extremely high.
The present invention seeks to reduce the problem, and the cost, down to manageable proportions by the application of a concept that seems to be quite new in the machine translation area, namely the translation in a first stage of each source language into a common intermediate language, followed by the translation from that intermediate language into the or each target language; using this concept, while two languages now need four dictionaries, and while three still need six, four languages only need eight (instead of twelve), and ten languages only need twenty (instead of ninety). Indeed, for any N languages there are only 2N pairs with the intermediate language (N into it, N out of it), requiring merely the same number, 2N, of dictionaries. Clearly, the application of this intermediate language concept will very considerably reduce the cost and complexity of machine translation.
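The dictionary counts quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are not from the specification.

```python
def direct_dictionaries(n: int) -> int:
    """Dictionaries needed when every ordered language pair has its own: N^2 - N."""
    return n * (n - 1)


def intermediate_dictionaries(n: int) -> int:
    """Dictionaries needed with one intermediate language: N into it, N out of it."""
    return 2 * n


for n in (2, 3, 4, 10):
    print(n, direct_dictionaries(n), intermediate_dictionaries(n))
```

The two counts coincide at three languages; below that the intermediate-language scheme costs more, and above it the saving grows with every language added.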
In one aspect, therefore, this invention provides a translation device which includes a source language module, into which there is input the source language to be translated from and in which that source language is translated into an intermediate language which is thereafter output from the source module, in association with a target language module, into which there is input the intermediate language output of the source module and in which the intermediate language is translated into the target language which is thereafter output from the target module.
The device of the invention includes a source language module and a target language module, and (as discussed hereinafter) in use these two will form part of a complete translation system including input/output means and so on. The two modules are associated together both in the sense that the output of the former is the input of the latter and in the sense that each makes use of the same intermediate language. Moreover, it will generally be the case that the two modules are physically associated (being adjacent parts of the whole translation system). It is here convenient to point out that, since it is a prime purpose of the invention to reduce the complexity and cost of translating machines, while each module may be no more than a portion of and integral with the whole machine, nevertheless it is very much preferred that each module in fact be a discrete physical entity, readily physically replaceable by some other module of the same sort (source or target, as appropriate). Indeed, it is preferred that each module be of a "plug-in" type so that it can be so physically replaced with the minimum of difficulty. Thus, if it is required to translate from the chosen source language not into the first chosen target language but into a second, then the first target language module may simply be removed and replaced by the second target language module, no other conceptual change being necessary. Similarly, if it is required to translate not from the first chosen source language but from a second, then the first source language module may simply be removed and replaced by the second source module, again without any other conceptual change being necessary.
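In software terms, the "plug-in" arrangement resembles two interchangeable interfaces joined only by the intermediate language. The following is a minimal sketch under that assumption; the class names, the word-for-word tables and the Esperanto-style intermediate tokens are all hypothetical and stand in for the far richer modules the specification describes.

```python
from typing import Protocol


class SourceModule(Protocol):
    def to_intermediate(self, text: str) -> list[str]: ...


class TargetModule(Protocol):
    def from_intermediate(self, units: list[str]) -> str: ...


class TableSource:
    """Toy source module: a word-for-word source-to-intermediate table."""

    def __init__(self, table: dict[str, str]) -> None:
        self.table = table

    def to_intermediate(self, text: str) -> list[str]:
        return [self.table[word] for word in text.lower().split()]


class TableTarget:
    """Toy target module: a word-for-word intermediate-to-target table."""

    def __init__(self, table: dict[str, str]) -> None:
        self.table = table

    def from_intermediate(self, units: list[str]) -> str:
        return " ".join(self.table[unit] for unit in units)


class TranslationDevice:
    """Holds one source and one target module; either can be swapped."""

    def __init__(self, source: SourceModule, target: TargetModule) -> None:
        self.source = source
        self.target = target

    def translate(self, text: str) -> str:
        return self.target.from_intermediate(self.source.to_intermediate(text))


french = TableSource({"chien": "hundo", "chat": "kato"})
german = TableTarget({"hundo": "Hund", "kato": "Katze"})
spanish = TableTarget({"hundo": "perro", "kato": "gato"})

device = TranslationDevice(french, german)
print(device.translate("chien chat"))
device.target = spanish  # "unplug" German, "plug in" Spanish
print(device.translate("chien chat"))
```

Replacing the target module changes nothing on the source side, which is the property the paragraph above relies on.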
It is the purpose of the device of the invention to translate from one language (source) into another (target) via an intermediate language. By source and target language is generally meant a natural human language (tongue) as exemplified by, say, French, German or Spanish, though it is not presently intended to exclude artificial languages, nor is it presently intended to exclude languages which are not human.
By intermediate language is generally meant much the same (a natural human language), but also not excluding artificial and/or non-human languages. However, in the case of the intermediate language there is very preferably applied what may be regarded as the second main feature of the invention, namely that in order to reduce the problems of translating into and out of the intermediate language this latter should as near as possible be a perfect language, the term "perfect language" here meaning a language that has no, or very few, irregularities of any sort.
Accordingly, since, by the very nature of their origins and development, few (if any) natural human languages are anything like perfect, the intermediate language is very preferably an artificial language, and may be either one of those artificial human languages known now or in the future (most of these, such as Volapük, Ido, Interlingua, Novial and Esperanto, which latter is presently preferred, are less "imperfect" than any available popular natural human language) or a language especially designed for this purpose.
It should be noted, incidentally, that the source and target languages do not necessarily need to be different. Instead, the translating device of the invention may be used to change (to "translate"), for example, the tense (say, from past to future) or the person (from singular to plural). Moreover, it should also be noted that a single translation machine could easily include more than one target language module at once, so providing simultaneous translation of a source language into several target languages.
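Simultaneous translation into several target languages follows naturally from the intermediate stage: the source text is analysed once and the same intermediate output is handed to each target module. A hedged sketch, in which the tables, the intermediate tokens and the function names are illustrative assumptions:

```python
from typing import Callable

Intermediate = list[str]

# Hypothetical word-for-word tables; intermediate tokens are Esperanto-style.
FR_TO_IL = {"maison": "domo", "livre": "libro"}
IL_TO_DE = {"domo": "Haus", "libro": "Buch"}
IL_TO_ES = {"domo": "casa", "libro": "libro"}


def analyse_french(text: str) -> Intermediate:
    return [FR_TO_IL[word] for word in text.split()]


def make_target(table: dict[str, str]) -> Callable[[Intermediate], str]:
    def render(units: Intermediate) -> str:
        return " ".join(table[unit] for unit in units)
    return render


def translate_to_many(
    text: str,
    source: Callable[[str], Intermediate],
    targets: dict[str, Callable[[Intermediate], str]],
) -> dict[str, str]:
    units = source(text)  # the source language is analysed only once
    return {name: render(units) for name, render in targets.items()}


result = translate_to_many(
    "maison livre",
    analyse_french,
    {"de": make_target(IL_TO_DE), "es": make_target(IL_TO_ES)},
)
print(result)
```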
The translation device of the invention will conveniently include appropriate input/output means, and these may be any considered suitable: for example, as input means, keyboards, light pens, punched or magnetic tapes, vocoders and character readers; and as output means, printers, tapes, VDUs and speech synthesizers.
The translation device of the invention is presently best realised in terms of an electronic device including the required data stores and one or more data-handling microprocessors, and the overall architecture of such a device is not too dissimilar to that of two of the presently-available translating machines in tandem. Broadly, then, the device as a whole comprises two recognisable "halves". The input half comprises: input means; a digitiser; a source language input buffer; a source language sentence and syntax analyser; a source language word analyser; a source-to-intermediate language dictionary; a general store and control unit; a source-to-intermediate language syntax changer; and an intermediate language output buffer. The output half comprises: an intermediate language input buffer; an intermediate-to-target language syntax changer; an intermediate-to-target language dictionary; a general store and control unit; and output means. All these sections may briefly be explained as follows (where "SL", "IL" and "TL" are Source Language, Intermediate Language and Target Language respectively):
For the input half
The input means may, as discussed hereinbefore, be any suitable, and at present the most likely input means is a keyboard.
The digitiser is required because it is usually most convenient to handle all input in digital, specifically binary digital, form before, during and after each translation (prior to the final output).
The SL input buffer holds part of the digitised text that is to be translated. It is only required to store a portion of the input text (for example, a paragraph or a single sentence), the subsequent processing being achieved portion by portion (thus, paragraph by paragraph, or sentence by sentence) rather than as a continuous flow mechanism. Naturally, the operation of the input means is controlled in corresponding fashion; while it is possible to run the input continuously, this makes the whole system operate more slowly (as the overall rate then needs to be set according to the most time-consuming section of text to be translated that is envisaged) unless a very large buffer is used as a "smoothing" system.
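Portion-by-portion processing can be pictured as a generator that hands the analyser one sentence at a time instead of the whole text. This is only a sketch: splitting on `.`, `!` and `?` is an assumption made here for brevity, and the specification does not say how portions are delimited.

```python
import re
from typing import Iterator


def portions(text: str) -> Iterator[str]:
    """Yield the text one sentence at a time (naive punctuation-based split)."""
    for match in re.finditer(r"[^.!?]+[.!?]*", text):
        sentence = match.group().strip()
        if sentence:
            yield sentence


for portion in portions("Bonjour. Vous ne parlez pas doucement! Merci"):
    print(portion)
```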
Under the overall control of the general store and control unit, the SL sentence and syntax analyser analyses the text portion held in the SL input buffer in conjunction with the SL word analyser, which itself is operated in conjunction with the SL-to-IL dictionary. The sentence and syntax analyser and the word analyser together perform operations such as sorting out and associating coupled words (for example, auxiliary verbs and their main verbs), and also relate adjectives and adverbs with their associated words.
The output of these analytical units is in the chosen intermediate language; it is passed to the SL-to-IL syntax changer where it is converted to standard syntactic form, then stored in the IL output buffer, where it is ready to be coupled to the output half of the translating machine.
As intimated, the SL general store and control unit oversees the operation of all the sections.
For the output half
The IL input buffer interfaces with the IL output buffer of the input half, and accepts therefrom the stored intermediate language text.
The IL-to-TL syntax changer re-orders the intermediate language text into the syntax appropriate to the required target language (the rules of syntax, and the positioning of auxiliary verbs, adjectives, etc., are held in the TL general store and control unit).
In conjunction with the IL-to-TL dictionary, the TL general store then produces the required target language output.
Finally, the digitised target language is passed to the output means (a printer or VDU, for example) where it is converted back into a more suitable, human-readable form.
Although the above-described electronic device of the invention may be so constructed that each language module contains all the components required, being complete in itself, it is more conveniently arranged that, while the device as a whole includes all the necessary components, the two modules include only those that are specific to the particular languages concerned. Thus, for the input half of the device the input means (except when, for example, a peculiar alphabet is involved), digitiser, SL input buffer and IL output buffer are common to all languages, and any one source language module itself need only contain the SL sentence and syntax analyser, SL word analyser, SL-to-IL dictionary, SL-to-IL syntax changer and SL general store and control unit. Similarly, for the output half of the device the IL input buffer and the output means (except when, for example, the latter requires a special alphabet) are common to all languages, and any one target language module itself need only contain the IL-to-TL syntax changer, IL-to-TL dictionary and TL general store and control unit.
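The division just described can be written down as sets and checked: each half should be exactly the union of its language-independent components and its module-specific components, with no overlap. The component names below are taken from the text; the helper `is_partition` is an illustrative addition.

```python
INPUT_HALF = {
    "input means", "digitiser", "SL input buffer",
    "SL sentence and syntax analyser", "SL word analyser",
    "SL-to-IL dictionary", "SL general store and control unit",
    "SL-to-IL syntax changer", "IL output buffer",
}
COMMON_INPUT = {"input means", "digitiser", "SL input buffer", "IL output buffer"}
SOURCE_MODULE = {
    "SL sentence and syntax analyser", "SL word analyser",
    "SL-to-IL dictionary", "SL-to-IL syntax changer",
    "SL general store and control unit",
}

OUTPUT_HALF = {
    "IL input buffer", "IL-to-TL syntax changer", "IL-to-TL dictionary",
    "TL general store and control unit", "output means",
}
COMMON_OUTPUT = {"IL input buffer", "output means"}
TARGET_MODULE = {
    "IL-to-TL syntax changer", "IL-to-TL dictionary",
    "TL general store and control unit",
}


def is_partition(whole: set[str], *parts: set[str]) -> bool:
    """True if the parts are pairwise disjoint and together equal the whole."""
    union = set().union(*parts)
    disjoint = sum(len(part) for part in parts) == len(union)
    return disjoint and union == whole


print(is_partition(INPUT_HALF, COMMON_INPUT, SOURCE_MODULE))
print(is_partition(OUTPUT_HALF, COMMON_OUTPUT, TARGET_MODULE))
```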
The invention extends, of course, to a translating machine whenever using a modular translating device as described and claimed herein.
One embodiment of the invention is now described, though only by way of illustration, with reference to the accompanying drawings in which:
the Figure shows in block diagram form a translating machine utilising the device of the invention.
The Figure itself needs no description. However, the mode of operation of the depicted device does, and may be described as follows.
In the design and manufacture of the two portions of the device a certain amount of software in the form of computer programming is necessary. For the purposes of this description it is assumed that this preparatory work has been done.
Input of source language
1. When the SL input buffer is vacant or nearly so the SL General Store and Control Unit (GSCU) starts, or restarts, the SL input means so as to initiate the passing of SL text into the digitiser where it is converted into a convenient digital machine code. From the digitiser the thus digitised SL text is passed into the SL input buffer. When the SL input buffer is full, the SL GSCU stops the SL input means passing more of the SL text.
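The start/stop behaviour in step 1 is a simple flow-control loop. The sketch below simulates it with a bounded queue; `capacity` (buffer full) and `low_water` ("vacant or nearly so") are assumed parameters, and the digitiser is omitted.

```python
from collections import deque
from typing import Iterator


def run_input(source: Iterator[str], capacity: int, low_water: int) -> tuple[list[str], int]:
    """Simulate the GSCU filling and draining the SL input buffer.

    Returns the items in the order they were processed and the number of
    times the input means was started or restarted.
    """
    buffer: deque[str] = deque()
    processed: list[str] = []
    restarts = 0
    exhausted = False
    while not exhausted or buffer:
        # Buffer vacant or nearly so: (re)start the input means.
        if len(buffer) <= low_water and not exhausted:
            restarts += 1
            while len(buffer) < capacity:  # stop the input when the buffer is full
                try:
                    buffer.append(next(source))
                except StopIteration:
                    exhausted = True
                    break
        if buffer:
            processed.append(buffer.popleft())  # hand one portion to the analyser
    return processed, restarts


items, restarts = run_input(iter(["s1", "s2", "s3", "s4", "s5"]), capacity=2, low_water=0)
print(items, restarts)
```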
2. The first SL text sentence in the SL input buffer contents is passed to the SL sentence/syntax analyser, the individual words being passed to the SL word analyser. In the latter the words are analysed into morphemes (which are the smallest syntactic units). Each set of morphemes constituting a word is analysed as a word, and the grammatical characteristics of the word noted against the word. This analysis is made in conjunction with the SL-to-IL Dictionary.
For example, the French 'vous ne parlez pas doucement' would be divided into 'vous/ne/parl/ez/pas/dou/ce/ment'.
'vous' would be found as a personal pronoun, second person plural.
'ne' would be noted as part of a word or phrase.
'parl' would be recognised as a morpheme associated with speaking and could have an associated significance scale number to distinguish it from, say, causer (to chat) or crier (to shout).
'ez' would be recognised as a verb ending (second person, plural, present indicative).
'pas' would be recognised as a 'step' or as a morpheme used with a number of other words, such as 'ne'.
The SL sentence and word analyser would examine the sentence for word associations, e.g. the 'ne' and 'pas' in the previously-used example of a sentence (but see further below).
3. In the SL-to-IL language dictionary it can be written into the initial software that some words have a "significance" scale associated with them.
For example, against the word 'doucement' (in the most general terms, "slowly") there can be a scaling number of, say, 3 to fit with a scale of, say, twelve degrees between the words associated with the slowest and fastest type of action, together with another number scaled for the degree of agitation from smooth calm to highly excited (this latter can be on a scale of, say, 10, and the word rated as 4).
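One way to picture the "significance" scales is as a small record attached to each dictionary entry, which can then be used to pick the nearest target-language word. In this sketch only the values for 'doucement' (3 of 12 and 4 of 10) come from the text; the range bounds starting at 1, the English candidates and their ratings, and the distance measure are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Significance:
    speed: int      # assumed range 1 (slowest) to 12 (fastest)
    agitation: int  # assumed range 1 (smooth calm) to 10 (highly excited)

    def __post_init__(self) -> None:
        if not 1 <= self.speed <= 12:
            raise ValueError("speed must be between 1 and 12")
        if not 1 <= self.agitation <= 10:
            raise ValueError("agitation must be between 1 and 10")

    def distance(self, other: "Significance") -> int:
        return abs(self.speed - other.speed) + abs(self.agitation - other.agitation)


DOUCEMENT = Significance(speed=3, agitation=4)

# Invented ratings for illustration only.
ENGLISH_CANDIDATES = {
    "slowly": Significance(3, 4),
    "gently": Significance(4, 2),
    "quickly": Significance(10, 6),
}


def closest(target: Significance, candidates: dict[str, Significance]) -> str:
    """Pick the candidate whose ratings lie nearest to the target's."""
    return min(candidates, key=lambda word: target.distance(candidates[word]))


print(closest(DOUCEMENT, ENGLISH_CANDIDATES))
```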
Words that may be defined without ambiguity can be entered into the SL-to-IL dictionary by the use of some standard internationally-useable source, for example the Oxford English Dictionary.
The words can be codified for the intermediate language (e.g. volume number, page number, column number, word number, meaning number); the coding is required to allow for the insertion of new words, which can be achieved as an addenda with an ever-growing content.
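The five-part code suggested above maps naturally onto a small record type. The sketch below is an assumption about how such a code might be stored and written out; the hyphen-separated text form and the numbers used are placeholders, not real dictionary locations.

```python
from typing import NamedTuple


class WordCode(NamedTuple):
    volume: int
    page: int
    column: int
    word: int
    meaning: int

    def encode(self) -> str:
        """Write the code as hyphen-separated numbers (an assumed format)."""
        return "-".join(str(part) for part in self)

    @classmethod
    def decode(cls, text: str) -> "WordCode":
        return cls(*(int(part) for part in text.split("-")))


code = WordCode(volume=7, page=123, column=2, word=15, meaning=1)  # placeholder values
print(code.encode())
print(WordCode.decode(code.encode()) == code)
```

New words could be given codes in a reserved range of volume numbers, which is one way to realise the ever-growing addenda mentioned in the text.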
The one standard word dictionary (e.g., O.E.D.) would be in common use as a source of definitions of all words in all the target languages used: thus, for 'horse', a word which appears to have a few common morphemes in common languages.
The SL-to-IL dictionary contains both morphemes and the grammatical characteristics of words as sets of morphemes.
4. The words (and any associated words, as "ne" and "pas") in the sentence are analysed syntactically by the SL sentence analyser. The sentence is analysed, using the language convention appropriate to the SL, by studying the positions of parts of speech and by the implied association of words that is detectable by grammatical agreement (i.e., gender, number and person). From this the form of the sentence may be detected, be it declaratory, interrogative, imperative or exclamatory, and for this, and the sequence of words that are associated with the sentence form, the sentence may be analysed into subject and predicate and (following the grammatical rules) into phrases and clauses.
5. The IL words produced by the SL-to-IL dictionary are now re-arranged in a formal manner to construct the intermediate language.
The SL sentence then exists as a string of lexical units taken from the IL dictionary together with their grammatical characteristics and with an expression that characterises the form of construction of the SL sentence. This IL 'text' is now passed to the output buffer.
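The IL 'text' described in step 5 (lexical units, their grammatical characteristics, and an expression for the sentence form) can be modelled as a small record that is serialised into the output buffer. The sketch below uses JSON purely for convenience; the field names, the feature labels and the Esperanto-style unit codes are illustrative assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LexicalUnit:
    code: str                 # IL dictionary entry (hypothetical label)
    features: dict[str, str]  # grammatical characteristics noted against the word


@dataclass
class ILSentence:
    form: str                 # e.g. declaratory, interrogative, imperative, exclamatory
    units: list[LexicalUnit]

    def to_buffer(self) -> bytes:
        """Serialise the sentence for the IL output buffer."""
        return json.dumps(asdict(self), ensure_ascii=False).encode("utf-8")

    @staticmethod
    def from_buffer(data: bytes) -> "ILSentence":
        raw = json.loads(data.decode("utf-8"))
        return ILSentence(raw["form"], [LexicalUnit(**unit) for unit in raw["units"]])


# 'vous ne parlez pas doucement' after analysis (labels are assumptions).
sentence = ILSentence(
    form="declaratory",
    units=[
        LexicalUnit("vi", {"pos": "pronoun", "person": "2", "number": "plural"}),
        LexicalUnit("paroli", {"pos": "verb", "tense": "present", "negated": "yes"}),
        LexicalUnit("malrapide", {"pos": "adverb"}),
    ],
)
restored = ILSentence.from_buffer(sentence.to_buffer())
print(restored == sentence)
```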
Output of target language
1. The IL input buffer accepts (under the control of the TL GSCU) the IL sentences with the codings from the SL module output buffer.
2. The IL-to-TL syntax changer then re-orders the words, phrases, and clauses according to the standard form and requirements of the TL grammar, the rules of which are stored in the TL GSCU.
3. The IL-to-TL dictionary then translates on a word by word basis from the intermediate language to the target language.
4. The grammar of the target language is then corrected to fit the grammatical rules concerning the peculiarities of the target language.
For example:
In English: I do not speak Spanish;
In French: Je ne parle pas Espagnol;
In Spanish: No hablo Español.
In the Spanish the personal pronoun 'Yo' is usually elided, except when emphasis is required.
5. Finally, with the syntax and words changed to suit the target language, the target language text is passed in digital form to the TL output, where it is reconstituted into a form more easily read by humans.
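The three example sentences can be generated from one intermediate-language sentence by giving each target language its own dictionary and a couple of grammar rules. The sketch below compresses steps 2 to 4 into a single function and is not the patent's procedure: the rule fields, the Esperanto-style unit labels, and the pre-conjugated verb forms in the dictionaries are all simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    concept: str  # IL lexical unit (hypothetical label)
    role: str     # "subject", "negation", "verb" or "object"


@dataclass(frozen=True)
class TargetRules:
    dictionary: dict[str, str]
    negation: tuple[str, str]          # words placed before and after the verb
    drop_subject_pronoun: bool = False


def render(units: list[Unit], rules: TargetRules) -> str:
    negated = any(unit.role == "negation" for unit in units)
    words: list[str] = []
    for unit in units:
        if unit.role == "negation":
            continue  # realised around the verb instead
        if unit.role == "subject" and rules.drop_subject_pronoun:
            continue  # e.g. Spanish usually omits 'yo'
        word = rules.dictionary[unit.concept]
        if unit.role == "verb" and negated:
            before, after = rules.negation
            words.extend(w for w in (before, word, after) if w)
        else:
            words.append(word)
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:]


IL_SENTENCE = [Unit("mi", "subject"), Unit("ne", "negation"),
               Unit("paroli", "verb"), Unit("hispana", "object")]

ENGLISH = TargetRules({"mi": "I", "paroli": "speak", "hispana": "Spanish"}, ("do not", ""))
FRENCH = TargetRules({"mi": "je", "paroli": "parle", "hispana": "Espagnol"}, ("ne", "pas"))
SPANISH = TargetRules({"mi": "yo", "paroli": "hablo", "hispana": "Español"}, ("no", ""),
                      drop_subject_pronoun=True)

for rules in (ENGLISH, FRENCH, SPANISH):
    print(render(IL_SENTENCE, rules))
```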
Claims (1)
1. A translation device which includes a source language module, into which there is input the source language to be translated from and in which that source language is translated into an intermediate language which is thereafter output from the source module, in association with a target language module, into which there is input the intermediate language output of the source module and in which the intermediate language is translated into the target language which is thereafter output from the target module.
2. A translation device as claimed in claim 1, wherein the source language module and the target language module are each a discrete physical entity, readily physically replaceable by some other module of the same sort (source or target, as appropriate).
3. A translation device as claimed in either of the preceding claims, wherein the intermediate language is an artificial language, less "imperfect" than any available popular natural human language.
4. A translation device as claimed in any of the preceding claims, wherein there is more than one target language module at once, so enabling simultaneous translation of a source language into several target languages.
5. A translation device of the invention as claimed in any of the preceding claims, which comprises:
a) an input half itself comprising: input means; a digitiser; a source language input buffer; a source language sentence and syntax analyser; a source language word analyser; a source-to-intermediate language dictionary; a general store and control unit; a source-to-intermediate language syntax changer; and an intermediate language output buffer; and
b) an output half itself comprising: an intermediate language input buffer; an intermediate-to-target language syntax changer; an intermediate-to-target language dictionary; a general store and control unit; and output means.
6. A translation device as claimed in claim 5, wherein the source language module itself only contains the SL sentence and syntax analyser, SL word analyser, SL-to-IL dictionary, SL-to-IL syntax changer and SL general store and control unit, while the target language module itself only contains the IL-to-TL syntax changer, IL-to-TL dictionary and TL general store and control unit.
7. A translating device as claimed in any of the preceding claims and substantially as described hereinbefore.
8. A translating machine whenever using a translating device as claimed in any of the preceding claims.
New Claims or Amendments to Claims filed on 3 Dec 1981.
Superseded Claim 1.
New or Amended Claims:
1. A human language translation device which includes a source language module, into which there is input the source human language to be translated from and in which that source language is translated into an intermediate language which is thereafter output from the source module, in association with a target language module, into which there is input the intermediate language output of the source module and in which the intermediate language is translated into the target human language which is thereafter output from the target module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB8110483A GB2096374B (en) | 1981-04-03 | 1981-04-03 | Translating devices |
Publications (2)
Publication Number | Publication Date |
---|---|
GB2096374A (en) | 1982-10-13 |
GB2096374B (en) | 1984-05-10 |
Family
ID=10520895
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB8110483A Expired GB2096374B (en) | 1981-04-03 | 1981-04-03 | Translating devices |
Country Status (1)
Country | Link |
---|---|
GB (1) | GB2096374B (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0300025A4 (en) * | 1987-02-05 | 1990-12-27 | Bruce G. Tolin | Method of using a created international language as an intermediate pathway in translation between two national languages |
EP0300025A1 (en) * | 1987-02-05 | 1989-01-25 | TOLIN, Bruce G. | Method of using a created international language as an intermediate pathway in translation between two national languages |
EP0387226A1 (en) * | 1989-03-06 | 1990-09-12 | International Business Machines Corporation | Natural language analyzing apparatus and method |
US5386556A (en) * | 1989-03-06 | 1995-01-31 | International Business Machines Corporation | Natural language analyzing apparatus and method |
US5237502A (en) * | 1990-09-04 | 1993-08-17 | International Business Machines Corporation | Method and apparatus for paraphrasing information contained in logical forms |
US5541837A (en) * | 1990-11-15 | 1996-07-30 | Canon Kabushiki Kaisha | Method and apparatus for further translating result of translation |
EP0486017A2 (en) * | 1990-11-15 | 1992-05-20 | Canon Kabushiki Kaisha | Method and apparatus for further translating result of translation |
EP0486017A3 (en) * | 1990-11-15 | 1993-06-30 | Canon Kabushiki Kaisha | Method and apparatus for further translating result of translation |
EP0590332A1 (en) * | 1992-09-28 | 1994-04-06 | Siemens Aktiengesellschaft | Method for realising an international language bond in an international communication network |
WO1995016968A1 (en) * | 1993-12-15 | 1995-06-22 | Jean Gachot | Method and device for converting a first voice message in a first language into a second message in a predetermined second language |
FR2713800A1 (en) * | 1993-12-15 | 1995-06-16 | Gachot Jean | Method and apparatus for transforming a first voice message into a first language, into a second voice message spoken in a second predetermined language. |
EP1464006A1 (en) * | 2001-12-21 | 2004-10-06 | Eli Abir | Multilingual database creation system and method |
EP1464007A1 (en) * | 2001-12-21 | 2004-10-06 | Eli Abir | Multilingual database creation system and method |
EP1464006A4 (en) * | 2001-12-21 | 2006-05-24 | Eli Abir | Multilingual database creation system and method |
EP1464007A4 (en) * | 2001-12-21 | 2006-05-24 | Eli Abir | Multilingual database creation system and method |
Also Published As
Publication number | Publication date |
---|---|
GB2096374B (en) | 1984-05-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PCNP | Patent ceased through non-payment of renewal fee |