DE3733674A1

DE3733674A1 - Speech analyser

Info

Publication number: DE3733674A1
Application number: DE19873733674
Authority: DE
Inventors: Toshihiko Yokogawa
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1986-10-03
Filing date: 1987-10-05
Publication date: 1988-04-21
Also published as: NL8702359A; FR2604814B1; FR2604814A1; DE3733674C2

Abstract

A speech analyser has a dictionary device storing dictionary data, including morpheme data for words, compound words and phrases, and an analysing device that carries out a morphological analysis of an entered sentence by referring to the dictionary device. The dictionary device contains degree-of-coupling data indicating the degree of coupling between the words that make up each compound word or phrase. The analysing device looks up each word of the entered sentence in the dictionary device and, if several dictionary entries are retrieved for a word in combination with other words, selects the combination of words with the higher degree of coupling by referring to the degree-of-coupling data.
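The selection rule in the abstract can be illustrated with a minimal sketch. It assumes a toy phrase dictionary in which each entry carries a numeric degree of coupling; the entries, the numeric scale and the function name `select_phrases` are hypothetical and are not taken from the patent. When two candidate phrases compete for the same word, the one with the higher degree of coupling is kept.

```python
# Hypothetical phrase dictionary: tuple of words -> degree of coupling.
# Entries and scores are invented for illustration only.
PHRASES = {
    ("in", "front", "of"): 9,
    ("of", "course"): 8,
    ("front", "door"): 4,
}


def select_phrases(words):
    """Group words into phrases, preferring the higher degree of coupling."""
    # Collect every dictionary phrase that occurs anywhere in the sentence.
    matches = []
    for phrase, degree in PHRASES.items():
        n = len(phrase)
        for i in range(len(words) - n + 1):
            if tuple(words[i:i + n]) == phrase:
                matches.append((degree, i, i + n))

    # Accept candidates from the strongest coupling downwards and reject
    # any candidate that overlaps a word already claimed by a stronger one.
    matches.sort(key=lambda m: (-m[0], m[1]))
    taken = [False] * len(words)
    chosen = {}
    for degree, start, end in matches:
        if not any(taken[start:end]):
            chosen[start] = end
            for k in range(start, end):
                taken[k] = True

    # Emit phrases as single units and all remaining words individually.
    units, i = [], 0
    while i < len(words):
        end = chosen.get(i, i + 1)
        units.append(" ".join(words[i:end]))
        i = end
    return units


print(select_phrases("in front of course".split()))
```

In this toy input the word "of" could belong either to "in front of" or to "of course"; the sketch keeps "in front of" because its assumed degree of coupling is higher.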

Description

The invention relates to a speech analyzer, and in particular to a speech analyzer for analyzing natural languages that is intended for use in conjunction with automatic translation equipment.

Conventional speech analyzers suffer from various difficulties, which are set out below. When analyzing the morphemes in a sentence, such as words, it is important to judge whether a particular word is used on its own or as part of a compound word or phrase (idiom) coupled with one or more other words, and this judgment has a considerable influence on the result of the analysis.

Conventional analyzers use one of two systems. The first carries out the analysis for both cases, assuming that successive words form a phrase and also that they are independent words, and finally selects an appropriate translation on the basis of the result. The second preferentially judges successive words to be a phrase or idiom. The first system requires a long processing time, while the second carries a high risk of producing an incorrect translation.

In the morphological analysis carried out for a translation, the part of speech and similar information for morphemes such as words are obtained by looking them up in a dictionary. Common words such as nouns and verbs are easy to look up, because most of them are stored in the dictionary.

However, expressions that denote, for example, a length (m), a velocity (m/s), an acceleration (m/s²) and other such units are extremely varied. It is therefore not efficient to store all of these expressions in the dictionary, since doing so increases the storage capacity needed for the dictionary information uneconomically.
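The storage argument can be made concrete with a small sketch: instead of storing every composite unit, only a few base units are stored and composite expressions such as m/s² are recognised by decomposition. The table contents, the accepted syntax and the function name `parse_unit` are assumptions made for illustration; the patent's own basic-unit dictionary is described in a later embodiment and may work differently.

```python
import re

# Hypothetical base-unit dictionary; only these entries are stored.
BASE_UNITS = {"m": "metre", "s": "second", "kg": "kilogram", "h": "hour"}
SUPERSCRIPTS = {"²": 2, "³": 3}


def parse_unit(text):
    """Return [(base_unit, exponent), ...], or None if text is not a unit.

    Units after a "/" receive a negative exponent, so "m/s²" is read as
    metre to the power 1 times second to the power -2.
    """
    factors = []
    for index, part in enumerate(text.split("/")):
        sign = 1 if index == 0 else -1
        match = re.fullmatch(r"([a-zA-Z]+)([²³]?)", part)
        if match is None or match.group(1) not in BASE_UNITS:
            return None
        exponent = SUPERSCRIPTS.get(match.group(2), 1)
        factors.append((match.group(1), sign * exponent))
    return factors


for expression in ("m", "m/s", "m/s²"):
    print(expression, parse_unit(expression))
```

With four stored base units the sketch already accepts length, velocity and acceleration expressions, none of which needs its own dictionary entry.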

In the morphological analysis of a sentence, a numerical expression in one language does not always correspond exactly to that in other languages. For example, the basic concept of counting, that is, the place notation, differs between Japanese and European languages such as English or German. If an English or German numerical expression such as "one hundred and two thousand two hundred and four" is simply broken down into its components and each component is replaced by the corresponding Japanese expression, it is analyzed merely as "100 and 2,200 and 4". It should ultimately be rendered correctly as "102,204", that is, translated into Japanese, in which "man" means ten thousand, "sen" means thousand and "hyaku" means hundred.

In the morphological analysis carried out for a translation, an input sentence is divided into dictionary look-up units by means of delimiters such as spaces, commas and colons, and the dictionary is searched by these look-up units in order to obtain the part of speech and other information. In this case a proper name, for example, may be used in two or more different senses depending on the context. "Osaka City", for example, is used to denote a group as the subject, as in "Osaka City established . . .", and is also used to denote a place as an object, as in ". . . Osaka City". Since only one meaning is stored for each proper name, such different uses cannot be handled according to their meaning, which reduces the accuracy of the morphological analysis.
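The numeral example ("one hundred and two thousand two hundred and four") can be sketched as follows. The sketch first evaluates the English words as a whole, rather than component by component, and then regroups the value by the Japanese places named in the text (man, sen, hyaku). The function names and the digit-plus-place output format are assumptions for illustration, and the regrouping only covers the three places mentioned above.

```python
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def words_to_int(text):
    """Evaluate an English numeral phrase as one number."""
    total = current = 0
    for word in text.lower().replace("-", " ").split():
        if word == "and":
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word in SCALES:
            # Close the current group: "one hundred and two" thousand.
            total += current * SCALES[word]
            current = 0
        else:
            raise ValueError(f"not a numeral word: {word!r}")
    return total + current


def japanese_groups(n):
    """Regroup n by the places man (10,000), sen (1,000) and hyaku (100)."""
    parts = []
    for value, name in ((10_000, "man"), (1_000, "sen"), (100, "hyaku")):
        count, n = divmod(n, value)
        if count:
            parts.append(f"{count}-{name}")
    if n:
        parts.append(str(n))
    return " ".join(parts)


value = words_to_int("one hundred and two thousand two hundred and four")
print(value, "->", japanese_groups(value))
```

The point of the two-step design is that the English grouping by thousands ("102 thousand") is dissolved into a plain number before the Japanese grouping by ten-thousands ("10 man") is applied.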

For example, in the case of an English sentence ". . . in Central Park John Willson had a . . .", the sentence cannot be analyzed morphologically in an appropriate way unless it is recognized that, in this context, the run divides into the proper name "Central Park" and the other proper name "John Willson". In the same way, in the case of an English sentence ". . . in Boston Mr. Baker was . . .", it must be recognized that the run is to be divided into a proper name "Boston" and a proper name "Mr. Baker". In the conventional system, however, a run of such proper names is erroneously recognized as a whole as a single proper name.
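The splitting problem can be illustrated with a minimal sketch that uses a dictionary of registered proper names to break up a run of capitalised words. This is only one simple way to do it (longest dictionary match, with unmatched words collected into an unregistered name); it is not the method of the embodiments, and the dictionary contents and the function name `split_capitalised_run` are hypothetical.

```python
# Hypothetical dictionary of registered proper names, as word tuples.
KNOWN_NAMES = {("Central", "Park"), ("Boston",)}
MAX_LEN = max(len(name) for name in KNOWN_NAMES)


def split_capitalised_run(run):
    """Split a run of capitalised words into separate proper names."""
    names, pending, i = [], [], 0
    while i < len(run):
        # Try the longest registered name that starts at position i.
        for n in range(min(MAX_LEN, len(run) - i), 0, -1):
            if tuple(run[i:i + n]) in KNOWN_NAMES:
                if pending:  # close any unregistered name collected so far
                    names.append(" ".join(pending))
                    pending = []
                names.append(" ".join(run[i:i + n]))
                i += n
                break
        else:
            # No registered name starts here: treat the word as part of
            # an unregistered proper name.
            pending.append(run[i])
            i += 1
    if pending:
        names.append(" ".join(pending))
    return names


print(split_capitalised_run(["Central", "Park", "John", "Willson"]))
print(split_capitalised_run(["Boston", "Mr.", "Baker"]))
```

A conventional analyzer that treats the whole capitalised run as one unit would return a single name in both cases.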

When the morphemes of an English sentence are analyzed, a sequence of words beginning with a capital letter is generally recognized as a proper name. However, when several words beginning with a capital letter follow one another, it is not always appropriate to recognize them as a whole as one proper name. There are cases in which they are actually several proper names that merely happen to appear in succession. In the morpheme analysis carried out for a translation, the part of speech and other information are obtained by looking up a dictionary. Since most ordinary nouns, verbs and so on can be stored in the dictionary, they are easily found and the corresponding information is obtained.

However, since proper names come in a great variety, it is not possible to store all of them in a dictionary. Consequently, proper names that are not registered in the dictionary cannot be recognized as proper names.

Where a character string assembled according to a particular pattern is present, there is a high risk of incorrect parsing if the processing for normal word units, as used for ordinary sentences, is applied to such a string, and this may result in a meaningless translation.

For example, where an English character string reads "Sunday, Jan. 26, '80" and thus expresses a particular date, it is very difficult to translate it into Japanese, because in Japanese it should be expressed as "'80-nen, 1-gatsu, 26-nichi, nichiyobi", where "nen" means year, "gatsu" month, "nichi" day and "nichiyobi" Sunday. Furthermore, since the part of speech and the semantic features of derived words are not estimated, an accurate translation cannot be obtained in some cases.
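The date example can be sketched by treating the string as one pattern instead of as ordinary word units. The regular expression below is an assumption that covers only the layout quoted in the text (weekday, abbreviated month, day, two-digit year); the function name `date_to_japanese` is hypothetical, and the weekday names other than "nichiyobi" are the usual romanised Japanese names, not taken from the patent.

```python
import re

MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}
# Only "nichiyobi" appears in the text; the others are added for completeness.
WEEKDAYS = {
    "Sunday": "nichiyobi", "Monday": "getsuyobi", "Tuesday": "kayobi",
    "Wednesday": "suiyobi", "Thursday": "mokuyobi", "Friday": "kinyobi",
    "Saturday": "doyobi",
}
# Assumed layout: "Sunday, Jan. 26, '80"
DATE = re.compile(
    r"(?P<weekday>[A-Z][a-z]+), (?P<month>[A-Z][a-z]{2})\.? "
    r"(?P<day>\d{1,2}), '(?P<year>\d{2})")


def date_to_japanese(text):
    """Reorder an English date string into year, month, day, weekday."""
    match = DATE.fullmatch(text)
    if match is None:
        return None
    weekday, month = match["weekday"], match["month"]
    if weekday not in WEEKDAYS or month not in MONTHS:
        return None
    year, day = match["year"], int(match["day"])
    return f"'{year}-nen, {MONTHS[month]}-gatsu, {day}-nichi, {WEEKDAYS[weekday]}"


print(date_to_japanese("Sunday, Jan. 26, '80"))
```

Recognising the whole string first is what allows the components to be reordered; a word-by-word analysis would keep the English order.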

A grammar used for analyzing a language, for example a context-free grammar (CFG), has the disadvantage that it outputs many superfluous solutions that ultimately cannot be used, whether a top-down or a bottom-up method is applied. Many of these superfluous solutions are obviously recognized as errors when they are actually read. However, since structural transformation or translation is carried out for such superfluous solutions as well, and the adequacy of the result is judged only afterwards in the respective processing steps, processing time is often wasted. For example, the English word "let" can express both a command and an invitation, so a grammatical (morphological or syntactic) analysis has to be carried out for each possibility, which reduces efficiency. It is also difficult to select one of the meanings.

In addition, in English or German a number of words can be joined by hyphens, as in "take-care-of-him attitude", in order to form an adjectival group freely. This, however, is difficult to handle by ordinary grammatical analysis.

The situation is the same for the tag question. Although the form of the tag question in English is extremely restricted, it requires extremely complicated processing under the usual analysis method. Moreover, it is not easy to determine which verb the tag question relates to.
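Because the form of the English tag question is so restricted, it can be recognised as a surface pattern before any full grammatical analysis. The sketch below assumes a tag consisting of a comma, an auxiliary (optionally negated), a pronoun and a question mark; the word lists and the function name `find_tag_question` are illustrative assumptions, and the sketch does not attempt the harder step of deciding which verb the tag relates to.

```python
import re

AUXILIARIES = {"is", "are", "was", "were", "do", "does", "did", "have",
               "has", "had", "can", "could", "will", "would", "should",
               "must"}
# Contracted negatives whose stem is not simply the auxiliary itself.
IRREGULAR_NEGATIVES = {"can't": "can", "won't": "will", "shan't": "shall"}

TAG = re.compile(
    r",\s*([A-Za-z]+(?:n't)?)\s+(I|you|he|she|it|we|they|there)\s*\?$",
    re.IGNORECASE)


def find_tag_question(sentence):
    """Return the parts of a sentence-final tag question, or None."""
    match = TAG.search(sentence.strip())
    if match is None:
        return None
    word, pronoun = match.group(1).lower(), match.group(2).lower()
    if word in IRREGULAR_NEGATIVES:
        auxiliary, negative = IRREGULAR_NEGATIVES[word], True
    elif word.endswith("n't") and word[:-3] in AUXILIARIES:
        auxiliary, negative = word[:-3], True
    elif word in AUXILIARIES:
        auxiliary, negative = word, False
    else:
        return None
    return {"auxiliary": auxiliary, "negative": negative, "pronoun": pronoun}


print(find_tag_question("You saw him, didn't you?"))
```

Detecting the tag as a structural feature first means the main clause can then be analysed without the tag, which is the kind of saving the ninth object of the invention aims at.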

The invention is intended to eliminate the above difficulties. A first object of the invention is therefore to provide a speech analyzer that can judge the degree of coupling between successive words and, on the basis of that result, judge whether or not they form a phrase. A second object of the invention is to provide a speech analyzer that can carry out a morphological analysis of an input sentence containing a composite character string, for example units of measurement, without all such character strings being stored in a dictionary. A third object of the invention is to provide a speech analyzer that can carry out an appropriate morphological analysis of an expression containing numerical values. A fourth object of the invention is to provide a speech analyzer that can translate a proper name into a meaning appropriate to the context. A fifth object of the invention is to provide a speech analyzer that can correctly carry out a morphological analysis of an expression containing a number of consecutive nouns. A sixth object of the invention is to provide a speech analyzer that can process a proper name not registered in the dictionary and can carry out an appropriate analysis of a proper name, taking into account its relationship with the word groups placed before and after it. A seventh object of the invention is to provide a speech analyzer that can correctly analyze the morphemes of a character string whose elements are combined according to a special rule to express a special meaning. An eighth object of the invention is to provide a speech analyzer that can estimate the grammatical feature, meaning and so on of a word recognized as a derivative according to a predetermined rule. Finally, a ninth object of the invention is to provide a speech analyzer that recognizes the structural feature of an input sentence and carries out a grammatical analysis in accordance with that feature.

According to the invention, the nine objects set out above are achieved (in the order in which they are listed) by a speech analyzer having the features of claims 1, 3, 8, 10, 16, 17, 21, 23 and 24. The subclaims that refer back directly or indirectly to those claims are advantageous developments and additions according to the invention.

A speech analyzer according to the invention is explained in detail below by way of nine different embodiments, with reference to the accompanying drawings, in which:

Figs. 1 to 10 show a first embodiment of a speech analyzer according to the invention, applied to an automatic English-Japanese translation device, in which:

Fig. 1 is a functional block diagram of an example of the detailed structure of a morphological analysis section;

Fig. 2 is a functional diagram of the overall structure;

Fig. 3 is an explanatory diagram showing an example of the structure of a dictionary file provided with a highest-priority flag;

Fig. 4 is a flowchart of an example of a morphological analysis;

Fig. 5 is a flowchart of an example of the input processing in the morphological analysis;

Fig. 6 is an explanatory diagram of an example of shaping an input character string;

Fig. 7 is an explanatory diagram showing an example of dictionary look-up;

Figs. 8A to 8D are flowcharts showing an example of how a conflict between highest-priority flags is resolved in the morphological analysis;

Fig. 9 is an explanatory diagram of an example of the contents of a buffer for retrieved dictionary information after the dictionary has been consulted;

Fig. 10 is an explanatory diagram of an example of the contents of the buffer for retrieved dictionary information after the conflict between highest-priority flags has been resolved;

Figs. 11 to 16 show a second embodiment according to the invention, in which:

Fig. 11 is a block diagram of the embodiment;

Fig. 12 is a table showing an example of data stored in a dictionary file;

Fig. 13 is a table showing an example of data stored in a basic-unit dictionary file;

Fig. 14 is a table showing an example of data stored in a dictionary information preservation table;

Fig. 15 is a flowchart of the operation of this device;

Fig. 16 is a flowchart of the recognition unit;

Figs. 17 to 29 show a third embodiment of a speech analyzer according to the invention, applied to an automatic English-Japanese translation device, in which:

Fig. 17 is a functional block diagram of an example of the detailed structure of a morphological analysis;

Fig. 18 is a functional block diagram of the overall structure;

Figs. 19A and 19B are flowcharts of an example of a morphological analysis;

Fig. 20 is a flowchart showing an example of the collective arrangement of a currency symbol and a unit in the morphological analysis;

Figs. 21A and 21B are flowcharts of an example of hyphenated numerals in the morphological analysis;

Figs. 22A and 22B are flowcharts of an example of processing consecutive numbers in the morphological analysis;

Figs. 23A and 23B are flowcharts of an example of a collective arrangement with a preceding numerical value in the morphological analysis;

Fig. 24 is an explanatory diagram showing an example of the structure of a dictionary file provided with a numeric flag;

Fig. 25 is an explanatory diagram of an example of an input character string;

Figs. 26A to 26D are explanatory diagrams showing the contents of the dictionary information preservation table retrieved from the dictionary for the input character string shown in Fig. 25, at the successive processing steps;

Fig. 27 is an explanatory diagram of a further example of an input character string;

Fig. 28 is an explanatory diagram showing the contents of a currency symbol table, a place-notation table and a decimal point table in the dictionary;

Figs. 29A to 29D are explanatory diagrams showing an example of the dictionary information preservation table retrieved from the dictionary for the input character string shown in Fig. 27, at the successive processing steps;

Figs. 30 to 36 show a fourth embodiment of a speech analyzer according to the invention, in which:

Fig. 30 is a block diagram of this embodiment;

Fig. 31 is a table showing an example of data stored in another dictionary;

Fig. 32 is a flowchart of the operation of the entire device;

Fig. 33 is a flowchart of the processing for a proper name registered in a dictionary;

Fig. 34 is a flowchart of the processing for a proper name not registered in the dictionary;

Fig. 35 is a flowchart of the processing for deficient feature information;

Fig. 36 is a table of an example in which the data stored in the dictionary information preservation table are changed after the processing for the input sentence;

Figs. 37 to 46 show a fifth embodiment of a speech analyzer according to the invention, applied to an automatic English-Japanese translation device, in which:

Fig. 37 is a functional block diagram of an example of the detailed structure of the morphological analysis section;

Fig. 38 is a functional block diagram of the overall structure;

Fig. 39 is an explanatory view of an embodiment of the structure of a dictionary file;

Fig. 40 is a flowchart of an example of a morphological analysis for a proper name;

Fig. 41 is a flowchart of an example of the collective arrangement of proper names that are registered in a dictionary for the morphological analysis;

Figs. 42 to 44 are flowcharts of an example of processing that depends on the position information in the morphological analysis of a proper name;

Fig. 45 is a flowchart of an example of the collective arrangement, in the morphological analysis, of proper names that are not registered in a dictionary;

Figs. 46A to 46F are explanatory views showing the contents of the dictionary information preservation table referred to in the dictionary for an example input character string, in accordance with the processing steps;

Figs. 47 to 52 show a sixth embodiment of a speech analyzer according to the invention, in which:

Fig. 47 is a block diagram of this embodiment;

Fig. 48 is a table showing an example of data stored in a reference dictionary;

Fig. 49 is a flowchart of the operation of the entire device;

Fig. 50 is a flowchart of the processing for proper names stored in the dictionary;

Fig. 51 is a flowchart of the processing for proper names that are not registered in the dictionary;

Fig. 52 is a table showing an example in which data in a dictionary information preservation table for the input sentence are changed;

Figs. 53 to 57 show a seventh embodiment of a speech analyzer according to the invention, used in conjunction with an automatic English-Japanese translation device, in which:

Fig. 53 is a functional block diagram of an embodiment of the detailed structure of the morphological analysis section;

Fig. 54 is a functional block diagram of the overall structure;

Figs. 55A and 55B are flowcharts of an example of the morphological analysis;

Fig. 56 is an explanatory view showing an example of the contents of the information table in an information processing section;

Fig. 57 is an explanatory view of an example of the contents of an adjustment table (7128);

Figs. 58 to 63 show an eighth embodiment of a speech analyzer according to the invention, in which:

Fig. 58 is a block diagram explaining the overall structure;

Fig. 59 is a block diagram explaining an example of the processing of words derived by means of a prefix;

Fig. 60 is a block diagram explaining an example of the processing of words derived by means of a suffix;

Fig. 61 is a block diagram of the full details obtained by combining Figs. 58 to 60;

Fig. 62 is a block diagram of further details of a complete unregistered-word processing section in Fig. 61;

Fig. 63 is a block diagram explaining an embodiment of an automatic translation device to which the invention is applied;

Figs. 64 to 90 show a ninth embodiment of a speech analyzer according to the invention, applied to an English-Japanese translation device, in which:

Fig. 64 is a functional block diagram of the overall structure;

Fig. 65 is a functional block diagram summarizing the function of recognizing the structural arrangement of an input English sentence as a block;

Fig. 66 is a flowchart of an example of a flow for the collective arrangement of a block with respect to the input sentence;

Fig. 67 is a flowchart showing details of the word processing in that processing flow;

Fig. 68 is an explanatory view of an example of the dictionary information for English words or phrases stored in a word dictionary;

Fig. 69 is an explanatory view of an example of table data, stored in an analysis file, for the block start state and end state and for a purpose and role evaluation state;

Fig. 70 is an explanatory view of an example of a collective arrangement for a structure;

Fig. 71 is an explanatory view of an example of a collective arrangement for a block;

Fig. 72 is an explanatory view of an example of English and word information collectively arranged in a block;

Fig. 73 is a flowchart of an example of the analysis processing performed in a corresponding analysis section;

Fig. 74 is an explanatory view similar to Fig. 68, showing an example of an entry in a word-phrase dictionary for the case in which this embodiment has an identical-case evaluation function;

Fig. 75 is an explanatory view similar to Fig. 69, showing an example of the start and end states of a block and a table of block preparation information for the case in which this embodiment has an identical-case evaluation function;

Fig. 76 is a functional block diagram similar to Fig. 64, showing the overall structure of a modification of this embodiment;

Fig. 77 is a functional block diagram similar to Fig. 76, summarizing the function of parsing (let) information in the modified embodiment shown in Fig. 76;

Fig. 78 is an explanatory view of an example of dictionary information, including the (let) information, for English words and phrases stored in the word memory in the modified embodiment;

Figs. 79 and 80 are explanatory views similar to Fig. 72, showing an example of the block and word information in which an English sentence containing (let) information is collectively arranged in a block;

Fig. 81 is a flowchart similar to Fig. 73, showing an example of a flow for the collective arrangement of the (let) information with respect to the input English sentence;

Fig. 82 is a flowchart similar to Fig. 73, showing an example of the analysis processing, including (let) information, performed in the analysis section in the modified embodiment;

Figs. 83A and 83B are flowcharts showing an example of the flow for parsing the (let) information of the input English sentence;

Fig. 84 is a flowchart of an example of the flow for parsing hyphenated words in the input English sentence;

Fig. 85 is an explanatory view similar to Fig. 72, showing an example of the block and word information collectively arranged in a block for an input English sentence containing a hyphenated word;

Fig. 86 is a functional block diagram similar to Fig. 64, showing the overall structure of another modified embodiment;

Fig. 87 is a functional block diagram similar to Fig. 65, summarizing the function of the morphological analysis of a tag question in the input English sentence in the modified embodiment shown in Fig. 86;

Figs. 88 and 89 are explanatory views similar to Fig. 72, showing an example of the block and word information collectively arranged in a block for an English sentence containing a tag question; and

Figs. 90A and 90B are flowcharts of an example of an analysis sequence for a tag question in the input English sentence.

The first embodiment of the invention will now be described. Fig. 1 shows the overall structure of the first embodiment, in which a speech analyzer according to the invention is applied to an automatic English-Japanese translation device. The invention can of course be applied just as effectively not only to an automatic translation device for translating English into Japanese, but also to any speech analyzer that analyzes the sentences of an input language, mainly in order to translate one language into another.

The embodiment of Fig. 1 has an input section 1010, through which an English text 1012 to be translated into Japanese is entered. The input section 1010 may comprise, for example, a keyboard with character keys such as alphanumeric keys and function keys, an optical character reader (OCR) for reading English text recorded on paper, and/or a file storage device for reading English text recorded on a storage medium such as a magnetic disk. The English text entered at the input section 1010 is read into a pre-editing section 1014, which performs preprocessing for the translation. This section mainly performs sentence recognition and the processing of unknown words, and thus functions as part of the morphological analysis.

The pre-edited English data are transferred, together with the information obtained in pre-editing, to a morphological analysis section 1016. Referring to a word dictionary 1018, section 1016 segments the sentence, analyzes the morphemes of the English sentence, performs various kinds of arrangement such as the processing of unknown words, time expressions and numerical expressions, and performs processing on the sentence as a whole, such as tag-question and identical-case recognition. The rules for the morphological analysis are stored in a rule file 1036.

The English data after morphological analysis are transferred, together with the dictionary information obtained by the morphological analysis, to a parsing section I 1020. (Parsing is understood below to mean grammatical analysis, i.e. automatic syntax analysis.) Section I 1020 is a functional section that analyzes the surface structure of an English sentence by applying grammar rules to the English data, and thereby finds all structural possibilities.

The English data analyzed in section I 1020 are supplied, together with the resulting analysis information, to a parsing section II 1022. This section selects one solution by applying a structural description to the surface-structure analysis result produced by section I. An acceptable parse tree for the English sentence is prepared in this way to establish its structure. These parsing rules are also stored in the rule file 1036.

After parsing, the English data are transferred as parse-tree data to a structure transformation section 1024. In section 1024, a corresponding Japanese sentence tree is prepared from the structural tree, i.e. an intermediate structure of the English sentence, and is converted into an underlying Japanese structure from which the Japanese sentence can then easily be generated.

The structural tree data representing the underlying Japanese structure obtained by this structure transformation are supplied to a translation section 1026, in which the translated sentence is formed. This is a functional section that generates a Japanese sentence from the Japanese structure tree.

The Japanese sentence data translated in this way, i.e. the translated sentence data, are then passed to a post-editing section 1030. Section 1030 modifies the translated sentence data with reference to the dictionary 10128, using information that was used in the translation, in order to complete a more natural Japanese sentence. The Japanese sentence data are then transferred to an output section 1032 and output from it as the translated Japanese sentence 1034. The output section 1032 comprises, for example, a printer, a display and/or a file storage device such as a magnetic disk.

The flow of this series of translation operations is controlled by a control section 1038, which governs the entire device. The word dictionary 1018 stores dictionary data for words of the English and Japanese languages; in this embodiment it holds not only the vocabulary but also various information such as linkage relations, i.e. co-occurrence relations, meanings, plural and singular forms, and parts of speech. The file 1036 stores the rule data for the morphological and syntactic analysis.
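The sequence of functional sections described above can be pictured as a simple chain of stages. The following is only a minimal sketch under stated assumptions: the stage names follow the description, while `make_stage`, `run_pipeline` and the `history` field are hypothetical names introduced for illustration, and each stage body is a placeholder that merely records that it ran.

```python
from typing import Callable, Dict, List

# Stage names follow the description; the stage bodies are placeholders
# (an assumption made for illustration, not the patented processing).
STAGE_NAMES: List[str] = [
    "pre-editing",
    "morphological analysis",
    "parsing I",
    "parsing II",
    "structure transformation",
    "translation",
    "post-editing",
]


def make_stage(name: str) -> Callable[[Dict], Dict]:
    """Build a placeholder stage that only records that it has run."""
    def stage(data: Dict) -> Dict:
        # Return a new dict so that each stage leaves its input untouched.
        return {**data, "history": data["history"] + [name]}
    return stage


def run_pipeline(english_text: str) -> Dict:
    """Pass the input text through all stages in the order described."""
    data: Dict = {"text": english_text, "history": []}
    for name in STAGE_NAMES:
        data = make_stage(name)(data)
    return data
```

Each stage receives the data together with the information accumulated by the preceding stages, which mirrors the way every section in the description hands on both the sentence data and its analysis information.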

The control section 1038 is connected to an operation and display section 1040. Section 1040 has operating keys, such as translation command keys and cursor keys, with which an operator gives various commands to the device, and a display which visually presents the entered English text, the Japanese sentence resulting from the translation, intermediate data such as dictionary information, various instructions to the operator, and so on. Most of the operating and display functions may be incorporated in the keyboard, if one is provided at the input section 1010, or in the display, if one is provided at the output section 1032.

Fig. 1 shows the detailed structure of the morphological analysis section 1016. Section 1016 has an input unit 1100, namely a keyboard of the input section 1010, and an input interface 1104, which provides an interface to the input document file 1102. The input interface 1104 has an input character string buffer, which receives English character string data in the form of code data, for example ASCII, from the input unit 1100 or from the file 1102 and stores the character string data temporarily. The input character string may be one that has been pre-edited in section 1014.

As shown in Fig. 1, the morphological analysis section 1016 has a processing section 1106, a dictionary retrieval section 1108, a contradiction elimination processing section 1110 and a control section 1112. The processing section 1106 is a functional section that performs the morphological analysis and has a buffer for retrieved dictionary information, i.e. a dictionary information preservation table 1120 (see Fig. 9). The morphological analysis is performed by instructing dictionary retrieval successively from the head of the input character string using retrieval key character strings, storing the dictionary information obtained from the dictionary retrieval section 1108 in the retrieved dictionary information buffer 1120, and performing priority processing in accordance with the highest-priority flag, as described later.
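The retrieval step can be illustrated with a short sketch: starting at each word position from the head of the input, candidate key strings of increasing length are looked up, and every hit is stored in a table. This is an assumption-laden illustration, not the patented procedure. The function name `collect_candidates`, the `max_len` limit, the word-level tokenization and the toy dictionary (which maps a key string to its flag value, with the values for "up" and "to" assumed) are all hypothetical.

```python
from typing import Dict, List, Tuple

# Toy dictionary mapping a retrieval key to its flag value.
# The flags for "up" and "to" are assumed for illustration.
TOY_DICTIONARY: Dict[str, int] = {
    "get": 0,
    "get up": 1,
    "up": 0,
    "up to": 1,
    "to": 0,
}


def collect_candidates(
    words: List[str], dictionary: Dict[str, int], max_len: int = 3
) -> List[List[Tuple[str, int]]]:
    """For each start position, list every dictionary key found there."""
    table: List[List[Tuple[str, int]]] = []
    for start in range(len(words)):
        found: List[Tuple[str, int]] = []
        for length in range(1, max_len + 1):
            if start + length > len(words):
                break  # the key would run past the end of the input
            key = " ".join(words[start:start + length])
            if key in dictionary:
                found.append((key, dictionary[key]))
        table.append(found)
    return table
```

For the input words `["get", "up", "to"]`, the table holds both "get" and "get up" at the first position and both "up" and "up to" at the second, so the overlapping readings remain available for the later priority processing.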

The dictionary retrieval section 1108 is a functional section that retrieves dictionary information by searching the word dictionary 1018 with the retrieval key character string specified by the processing section 1106, and then transfers that information to the processing section 1106.

For each word entry, the word dictionary 1018 stores grammatical information such as the part of speech and the inflection, together with a highest-priority flag, as shown in Fig. 3 for an example of entry information. The dictionary is provided as a dictionary file with a highest-priority flag. The highest-priority flag indicates the degree of coupling between the words contained in a compound word or phrase that forms a dictionary entry: "0" indicates weak or no coupling, while "1" indicates strong coupling. A compound word or phrase judged to have strong coupling is evaluated as being used as a phrase; on the other hand, the possibility of use as individual words is also taken into consideration in parallel.

As illustrated in Fig. 3, the entries of the word dictionary 1018 are arranged alike for compound words, phrases and the individual words that make them up, with no distinction made between individual words and compound words or phrases. Each inflected form constitutes an entry of its own: if a word has several inflected forms, they are registered as separate entries, and the kind of inflection is indicated in the inflection field. The same applies to the part of speech: registration under several parts of speech is permitted, and part-of-speech information is included for each of them. Further information registered includes countability or uncountability for a noun, transitivity or intransitivity for a verb, a translated word, and so on.

For example, "get" is the infinitive form of a verb, and its highest preference flag is "0". The phrase "get up" is a phrase in the infinitive form, and its highest preference flag is "1". Likewise, the prepositional group "up to" has the highest preference flag "1", whereas a word group such as the compound word "white house" has the highest preference flag "0", which shows that the degree of coupling between its words is low. In Fig. 3, a special symbol denotes a blank space.
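The entry layout described above can be pictured with a small record type. The following is only a sketch: the class name `DictEntry`, its field names and the part of speech given for "white house" are assumptions made for illustration, not part of the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictEntry:
    key: str             # entry string; a plain space stands for the blank symbol
    part_of_speech: str  # e.g. "verb", "preposition", "noun"
    inflection: str      # e.g. "infinitive"; empty when not applicable
    top_flag: int        # highest preference flag: 1 = strong coupling, 0 = weak or none


# Entries taken from the examples in the text. The part of speech of
# "white house" is an assumption; the text only gives its flag.
WORD_DICTIONARY = [
    DictEntry("get", "verb", "infinitive", 0),
    DictEntry("get up", "verb", "infinitive", 1),
    DictEntry("up to", "preposition", "", 1),
    DictEntry("white house", "noun", "", 0),
]


def strongly_coupled(entries):
    """Return the keys of all entries whose highest preference flag is 1."""
    return [e.key for e in entries if e.top_flag == 1]
```

Single words and multi-word entries share one record type here, mirroring the statement that the dictionary makes no distinction between them.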

The dictionary information retrieved by the query section 1108 thus contains the highest preference flag. If the highest preference flag is set to "1" for identical character strings or for overlapping character strings, this contradiction must be eliminated. The contradiction elimination section 1110 performs the contradiction elimination and the subsequent processing, referring to the contradiction elimination rule for the highest preference flag, which is stored in the file 1036.

In the present embodiment, the contradiction elimination rule is applied in the following order (1) to (3) to make a preferential selection.

(1) a phrase or word whose part of speech is a verb;
(2) a compound word, phrase or word with the larger number of component words;
(3) a compound word, phrase or word located nearer the front of the sentence.
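The ordering of the three rules can be sketched as a pairwise comparison. This is a minimal illustration under stated assumptions: the `Candidate` type and the function `preferred` are hypothetical names, and rule (2) is interpreted here as a count of space-separated words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str            # entry string as found in the sentence
    part_of_speech: str  # e.g. "verb", "preposition"
    start: int           # 1-based position of the first character


def preferred(a: Candidate, b: Candidate) -> Candidate:
    """Return the candidate selected by rules (1) to (3), tried in that order."""
    # Rule (1): a verb is preferred over a non-verb.
    a_is_verb = a.part_of_speech == "verb"
    b_is_verb = b.part_of_speech == "verb"
    if a_is_verb != b_is_verb:
        return a if a_is_verb else b
    # Rule (2): the candidate with more component words is preferred
    # (assumption: components are counted as space-separated words).
    a_words, b_words = len(a.text.split()), len(b.text.split())
    if a_words != b_words:
        return a if a_words > b_words else b
    # Rule (3): the candidate nearer the front of the sentence is preferred.
    return a if a.start <= b.start else b
```

Each rule is consulted only when the preceding one fails to separate the two candidates, which is what makes the order of the list significant.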

The usage selected in this way for a word, that is, the parsing unit, is recorded as active information in the buffer 1120 for retrieved dictionary information in the morphological analysis section 1016. Active information "1" shows that the parsing unit is valid, whereas "0" shows that this possibility is not used.

The control section 1112 is a functional section that regulates and controls the operation and processing of each functional section in the morphological analysis section 1016. It may be included in the control section 1038, which controls the entire apparatus. The result of the morphological analysis is transferred to the parsing section I 1020 through an output interface 1114. If the result is not transferred directly to the parsing section I 1020, it is first stored in the parsing input file 1116 and the parsing dictionary information file 1118.

In this embodiment, the morphological analysis takes out all words, compound words and phrases that start at the respective position of the dictionary reference unit, and then discards the dictionary information obtained for the individual words making up a compound word or phrase that has been judged, according to the highest preference flag, to be a collective unit. That is, the degree of coupling between the words in the sentence is judged by referring to the highest preference flag in the dictionary information obtained in the morphological analysis. A compound word or phrase judged to have strong coupling is treated as being used as a phrase in the sentence; otherwise, the possibility of use as individual words is considered in parallel. This processing based on the highest preference flag is carried out by the sequence shown in Fig. 4.

Data for the input character string are read in through the input interface 1104 (1200); a dictionary reference unit for querying the dictionary file 1018 with the highest preference flag is taken out of the input character string (1201); the dictionary 1018 is queried accordingly (1203), and this is repeated until the end position of the sentence represented by the input character string data is reached (1202); the contradiction concerning the highest preference flag is then eliminated (1204), and the result of the morphological analysis is output to the parsing section I (1205).

In the input processing (1200), the data are first read from the file 1102 or the input unit 1100 into the input character string buffer of the input interface 1104 (see Fig. 5: 1210). The input character string data are entered, for example, in ASCII form. When the data in the file have been read completely (for example, when the EOF symbol is read), the processing section 1106 writes a NULL code into the input character string buffer to mark the end position.

The processing section 1106 then reforms the input character string (1211). For example, if two or more space-equivalent characters occur in succession, they are replaced by a single blank. Space-equivalent characters include the blank, the tab, the carriage return and so on; in the figures the blank and the carriage return are each represented by a symbol. Space-equivalent characters lying between the head of the input character string buffer and the first character that is not space-equivalent are removed.

For example, the input character string is reformed as shown in Fig. 6. The position of the symbol "NULL" indicates the end position of the buffer.
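The reforming step can be sketched in a few lines. This is an illustration only: the function name `reform_input` is hypothetical, the set of space-equivalent characters is assumed to be blank, tab, carriage return and line feed, and the NULL end marker is not modelled.

```python
import re


def reform_input(raw: str) -> str:
    """Normalize space-equivalent characters in an input character string."""
    # Collapse every run of space-equivalent characters into a single blank.
    collapsed = re.sub(r"[ \t\r\n]+", " ", raw)
    # Remove the blank, if any, that precedes the first real character.
    return collapsed.lstrip(" ")
```

With a made-up input, `reform_input("  I  will\tget\r\nup")` yields `"I will get up"`.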

The dictionary reference delimiters used in the processing 1201 for taking out the dictionary reference unit are located at characters other than alphabetic characters, numerals, apostrophes, hyphens and periods, and also at an apostrophe that follows a blank. The processing section 1106 has a head pointer for dictionary reference, which is initially set to the head of the buffer.

The query section 1108 queries the dictionary file 1018 with the highest preference flag, using as the query key character string the characters from the one indicated by the head pointer up to the one preceding the next delimiter. The dictionary entry and the query key character string are compared, and if they agree, the dictionary information is taken in (1203). Agreement is judged to exist when the entire character string of the entry coincides with at least the leading part of the character string starting at the head pointer, and the character immediately following that part is a dictionary reference delimiter, an apostrophe or a period. For example, as shown in Fig. 7, when the head pointer indicates the leading character "g" of the query key character string, the corresponding dictionary entries agree with it.

The retrieved dictionary information is stored in the buffer 1120 of the processing section 1106. Together with it, the start position and the end position of the matching character string are stored; these positions are counted consecutively from the head of the input buffer. The buffer 1120 for retrieved dictionary information also has a storage area for the active information, which indicates whether or not the retrieved dictionary information is valid for the subsequent processing; at this step it is set to "1" for every entry.

The head pointer for dictionary reference is then updated: scanning the character string from left to right, it is set to the character immediately following the delimiter nearest to the current head pointer. The dictionary reference is then carried out again. In the above example, the head pointer for dictionary reference first indicates "I" of "I", then "w" of "will", and then "g" of "get". When the head pointer reaches the NULL code, the end position is judged to have been reached (1202).
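The scan with the head pointer can be sketched as follows. Several simplifications are assumed: only the blank is treated as a delimiter, the dictionary is a plain list of entry strings, and the sentence fragment used in the usage note is merely one that is consistent with the character positions quoted in the text. The function names are hypothetical.

```python
def is_delimiter(ch: str) -> bool:
    # Simplification: only the blank counts as a dictionary reference delimiter.
    return ch == " "


def lookup_all(sentence: str, dictionary: list) -> list:
    """Collect (entry, start, end) triples with 1-based inclusive positions."""
    hits = []
    head = 0  # head pointer (0-based index into the sentence)
    n = len(sentence)
    while head < n:
        for entry in dictionary:
            end = head + len(entry)
            # The entry must match from the head pointer onward and be
            # followed by a delimiter or by the end of the sentence.
            if sentence.startswith(entry, head) and (
                end == n or is_delimiter(sentence[end])
            ):
                hits.append((entry, head + 1, end))
        # Move the head pointer to the character after the next delimiter.
        nxt = head
        while nxt < n and not is_delimiter(sentence[nxt]):
            nxt += 1
        head = nxt + 1
    return hits
```

For the fragment `"I will get up to"` and the entries "I", "will", "get", "get up", "up", "up to" and "to", the scan reports "get up" at positions 8 to 13 and "up to" at positions 12 to 16. All hits are kept at this stage; deciding between overlapping ones is left to the later contradiction elimination.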

Fig. 9 shows an example of the dictionary information retrieved in this way for the English input character string described above.

The contradiction elimination processing 1204 performed by the contradiction elimination section 1110 will now be described with reference to Figs. 8A to 8D. The flowchart of Figs. 8A and 8B shows the processing for the case where the positions of words for which the highest preference flag is set overlap one another, while the flowchart of Figs. 8C and 8D shows the processing for removing the parsing units that overlap an element with the highest preference flag, that is, for setting their active information to "0". In these flowcharts, the symbol "≦" denotes assignment, the symbol "→" denotes a pointer, and "p → x" denotes the content of x held by the entry to which the pointer p points.

First, sets of words are detected in which each word has the highest preference flag "1" and the positions of the words in the sentence overlap one another (steps 1220 to 1223). The elimination rule for the highest preference flag is then applied to each detected set, and the valid entries are selected (steps 1224 to 1235).

In the embodiment described above, the highest preference flag is "1" for "get up", with start position "8" and end position "13", and for "up to", with start position "12" and end position "16", in the input character string shown in Fig. 9, so the character positions of the two overlap. Rule (1) above is therefore applied first: referring to the part of speech of the entry at the saved pointer psave and the part of speech of the entry at the pointer p, it is judged whether each is a verb (1224). Since "get up" is a verb in this example, the combination "get up" is selected.

If rule (1) does not decide the matter, rule (2) is applied (1228): the length (lens) of the character string of the entry at the saved pointer psave is compared with the length (len) of the character string of the entry at the pointer p. If rule (2) does not decide the matter either, rule (3) is applied (1229): the start position of the entry at the saved pointer psave is compared with the start position of the entry at the pointer p. As soon as one of the rules (1) to (3), applied in this order, is satisfied, the active information of the entry that does not satisfy it, that is, the invalid entry, is set to "0" (1232). The active information of the other, valid entry is left at "1" (1231). The contradiction elimination rule is applied in this way to each entry in turn, advancing the pointer p step by step (1234, 1235) up to the end position, so that the active information remains "1" only for the valid entries. The resulting state for the above example is shown in Fig. 10; for the entry "up to", for instance, the active information is set to "0".

Next, the entries whose positions overlap, even partially, those of an entry for which both the active information and the highest preference flag are "1" are detected (1236 to 1241), and their active information is set to "0" (1242, 1249). This rule is likewise applied to each entry in turn, advancing the pointer p step by step (1243, 1248) up to the end position, and the active information of every invalid entry is set to "0". Consequently, for the entry "get up", for example, the active information of "get" and of "up" is set to "0" (Fig. 10). For the entries "white", "white house" and "house", the highest preference flag is "0" in every case, so their active information remains "1" even though their positions overlap.
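The two passes of the contradiction elimination can be sketched together. This is a simplified model rather than a transcription of the flowcharts in Figs. 8A to 8D: the names `Entry`, `overlaps`, `loses` and `resolve` are hypothetical, the pointer-based iteration is replaced by plain loops, rule (2) compares character lengths, and the parts of speech given to "I", "will", "up" and "to" in the usage note are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    text: str
    pos: str        # part of speech
    start: int      # 1-based start position in the sentence
    end: int        # 1-based end position (inclusive)
    top_flag: int   # highest preference flag
    active: int = 1 # active information, initially 1 for every entry


def overlaps(a: Entry, b: Entry) -> bool:
    """True if the two position ranges share at least one character."""
    return a.start <= b.end and b.start <= a.end


def loses(a: Entry, b: Entry) -> bool:
    """True if a is eliminated in favour of b under rules (1) to (3)."""
    a_verb, b_verb = a.pos == "verb", b.pos == "verb"
    if a_verb != b_verb:                      # rule (1): verbs win
        return b_verb
    len_a, len_b = a.end - a.start + 1, b.end - b.start + 1
    if len_a != len_b:                        # rule (2): longer string wins
        return len_b > len_a
    return b.start < a.start                  # rule (3): earlier start wins


def resolve(entries: list) -> list:
    """Set active information and return the entries that stay active."""
    flagged = [e for e in entries if e.top_flag == 1]
    # Pass 1: among overlapping entries with flag 1, keep only the winner.
    for i, a in enumerate(flagged):
        for b in flagged[i + 1:]:
            if a.active and b.active and overlaps(a, b):
                (a if loses(a, b) else b).active = 0
    # Pass 2: deactivate everything that overlaps a surviving flagged entry.
    for winner in [e for e in flagged if e.active]:
        for e in entries:
            if e is not winner and e.active and overlaps(e, winner):
                e.active = 0
    return [e for e in entries if e.active]
```

Applied to entries for "I" (1 to 1), "will" (3 to 6), "get" (8 to 10), "get up" (8 to 13, flag 1), "up" (12 to 13), "up to" (12 to 16, flag 1) and "to" (15 to 16), the sketch leaves "I", "will", "get up" and "to" active. Entries whose flags are all "0", such as "white", "white house" and "house", are untouched by both passes and remain active in parallel.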

When the processing has been carried out in this way up to just before the end position (NULL), the contents of the input buffer of the input interface 1104 and of the buffer 1120 for retrieved dictionary information are output through the output interface 1114 to the parsing section I 1020. The contents of the buffer 1120 are output only for those entries whose active information is "1". For example, the contents of the input buffer may be written to the parsing input file 1116, and the contents of the information buffer 1120 to the parsing dictionary information file 1118. If both the active information and the highest preference flag are output, the structure of the information file 1118 is identical to that of the information buffer 1120. It may, however, also be arranged that the active information and the highest preference flag are not output.

The invention will now be described with reference to a second embodiment of a speech analyzer according to the invention, shown in Fig. 11, which is applied to an automatic English-to-Japanese translation apparatus. This embodiment has an input section 2014, to which data are input from an input device 2010 or a source text file 2012. The input device 2010 comprises, for example, a keyboard with character keys such as alphanumeric keys and function keys, and an optical character reader for reading English text recorded on paper. The source text file 2012 is a storage device in which the English text is recorded on a storage medium such as a magnetic disk.

The input section 2014 has an input character string buffer 2014a and stores in it the English input sentence supplied by the input device 2010 or the source text file 2012. The input section 2014 reads out the input sentence stored in the buffer 2014a and delivers it to a processing section 2016.

The processing section 2016 is a functional section that performs the morphological analysis of the input sentence delivered by the input section 2014 by querying a dictionary file. The processing section 2016 has a dictionary information preservation table 2016a, in which it stores the information obtained by querying a dictionary file 2022 or a basic unit dictionary file 2026, which is described later.

The processing section 2016 takes out, from the character string representing the input sentence supplied by the input section 2014, a query key character string as the unit by which the dictionary is queried. The query key character strings are taken out in sequence, starting from the first character of the string representing the input sentence, according to a predetermined rule. For example, the input sentence is divided in order from its head by delimiters such as blanks and commas, and the divided character strings are used as query key character strings. In this case, character strings expressing units, such as m, k and m/s, are also formed as query key character strings. The processing section 2016 sends each query key character string taken out of the input sentence to the dictionary query section 2020.

The dictionary query section 2020 queries a dictionary file 2022 on the basis of the query key strings supplied by the processing section 2016. The dictionary file 2022 stores entries together with grammar information such as the part of speech, as shown in Fig. 12. If an entry is present in the dictionary file 2022, the dictionary query section 2020 reads out the part-of-speech information etc. of that entry and supplies it to the processing section 2016. If the query of the file 2022 finds no entry, the dictionary query section 2020 reports this to the processing section 2016.

The processing section 2016 stores the part-of-speech information etc. retrieved by the section 2020 in the dictionary information holding table 2016a. If the file 2022 contains no entry for the query key string, the processing section 2016 passes the query key string to a unit recognition section 2024.

The section 2024 queries a basic unit dictionary file 2026 on the basis of the query key string supplied by the processing section 2016. The basic unit entries are stored in the file 2026 as shown in Fig. 13. If a basic unit entry for the string is present in the file 2026, the unit recognition section 2024 reads out that entry. If no entry is present, the query key string is divided into several substrings, as described later, and the file 2026 is queried several times. If basic unit entries are found in these repeated queries, several items of unit information are obtained from them. If no basic unit entry is found in any one of the queries, information is obtained indicating that the string is not registered in the dictionary.

The section 2024 supplies the basic unit entry, the compound unit information, or the information indicating that the word is not registered in the memory to the processing section 2016, which stores the information received from the section 2024 in the table 2016a. The table 2016a thus holds, for each query key string, the entry and the grammar information such as the part of speech obtained by querying the file 2022 or the file 2026. After the data have been stored in the table 2016a, the processing section 2016 supplies them, together with the input sentence, to the output interface 2018. The interface 2018 outputs the input sentence and the morphological analysis data supplied by the processing section 2016 to an output unit 2030, such as a printer or a display unit, or to a storage file 2032, such as a magnetic disk.

Alternatively, the input sentence and the morphological analysis data supplied by the processing section 2016 can be fed directly into a parsing device (not shown), which performs a syntax analysis, i.e. parsing, of the input sentence and then prepares a translated sentence on the basis of that analysis. The control section 2028 controls the operation of each of the functional sections of the device according to the invention and can advantageously be implemented as a microprocessor.

The operation of the device according to the invention will now be explained with reference to the flow chart shown in Fig. 15. First, the English input sentence is read from the input unit 2010 or from a source text file 2012 into the input section 2014 (2100). The sentence read into the input section 2014 is stored in the input character string buffer 2014a. The sentence in the buffer 2014a is then read out and supplied to the processing section 2016. When the sentence has been entered into the processing section 2016, the dictionary query unit is cut out (2102). That is, the character string representing the input sentence is divided successively, starting from the beginning of the string and according to a predetermined rule, into query key strings that serve as the units for querying the dictionary file 2022 or the basic unit dictionary file 2026. It is then judged whether or not a cut-out query key string exists (2104); if it does, the query key string is supplied to the dictionary query section 2020.

When the query key string has been supplied to the section 2020, the section 2020 queries the file 2022 for it (2106). It is then judged whether or not this string is present among the entries of the dictionary file 2022 shown in Fig. 12. If the entry is present, the grammar information stored in the file 2022, such as the part of speech, is read out, sent to the processing section 2016 and stored in the table 2016a (2110). The process then returns to step 2102, and the next dictionary query unit is cut out.

If no entry is present in the file 2022, the section 2020 returns the query key string to the processing section 2016. The section 2016 then passes the query key string to the unit recognition section 2024, in which unit recognition is performed (2112).

If the query key string supplied to the section 2020 is an ordinary word such as a noun or a verb, it is in most cases an entry of the dictionary file 2022, so that grammar information such as the part of speech is read from the file 2022, supplied to the processing section 2016 and recorded in the table 2016a. As described above, the file 2022 contains entries for ordinary words such as nouns and verbs, but no entries for character strings that express units. Consequently, if the query key string expresses a unit, such as kg or m/s, it has no entry in the file 2022, and the process proceeds to step 2112 for unit recognition.

The unit recognition operation at step 2112 is explained with reference to Fig. 16. When a query key string for which the file 2022 contains no entry is passed from the processing section 2016 to the section 2024, the section 2024 sets the pointer P to the first character of the query key string (2200).

The section 2024 then queries the basic unit dictionary file 2026 for the character string that begins at the character on which the pointer P is set (2202).

In this query it is judged whether a basic unit that has an entry in the file 2026 appears as a complete string at the start of the character string beginning at the pointer P. In other words, it is checked whether a string of one or more characters, starting at the character on which the pointer is set, matches any of the basic units that have an entry in the file 2026. For example, if the character at the pointer P is k, m, s, etc., the file 2026 contains entries for these single characters, as shown in Fig. 13.

The unit recognition section 2024 judges whether or not the query of the file 2026 has found an entry (2204). If an entry is present, the pointer P is advanced by the length of the recognized basic unit (2206). Thus, if the basic unit is k, m, s, etc., the pointer P is advanced by one character and set to the next character of the query key string.

The section 2024 then judges whether or not any character string remains from the character on which the pointer P is now set (2208). If such a string remains, the process returns to step 2202, in which the file 2026 is queried again for the string beginning at the pointer P. The section then judges whether or not the query of the file 2026 has found a basic unit entry; if it has, the pointer P is advanced by the length of the recognized basic unit.

If at step 2208 no character string remains from the character on which the pointer P is set, the lookup in the file 2026 is finished, i.e. the compound unit has been recognized successfully.

For example, if the query key string passed to the section 2024 is km/s, which expresses a unit, the file 2026 contains no entry for it, because km/s is itself a compound unit. The pointer P is therefore first set to k (2200), and k is looked up in the basic unit dictionary file 2026 to confirm that an entry exists (2202). The pointer P is then set to m (2206), and m is looked up in the file 2026 (2202) to confirm the entry in the same way. Since the unit recognition section 2024 treats a slash (/), a circle (○), etc. as part of a unit, the pointer P is next set to s, skipping the "/" in km/s (2206). Then s is looked up in the file 2026 to confirm the entry in the same way (2202). Since an entry has been found in the file 2026 for each of k, m and s, km/s is judged to be a character string that expresses a unit. In this way, a query key string is judged to express a unit if the file 2026 contains entries for all the characters that make up the string, or for all of its characters other than symbols such as the slash and the circle, which are in any case regarded as part of a unit.

When the unit recognition section 2024 has finished the lookup in the file 2026 and has succeeded in recognizing the compound unit, it supplies the unit information obtained in this way to the processing section 2016, where it is stored in the table 2016a (2210). Unit recognition is then finished.

If at step 2204 the query of the file 2026 finds no entry for the character string beginning at the character on which the pointer P is set, the string cannot be recognized as a basic unit or as a compound unit. The section 2024 therefore returns to the processing section 2016 information indicating that the string is a word not registered in the dictionary, i.e. that the word does not express a unit. This information is held in the table 2016a of the processing section 2016, and unit recognition is then finished.
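The pointer-driven unit recognition of Fig. 16 can be sketched as follows. This is an illustrative reading of the description, not the patented implementation: the contents of `BASE_UNITS` (standing in for the file 2026), the set `CONNECTORS`, the function name `recognize_unit`, and the preference for the longest matching basic unit are all assumptions.

```python
# Hypothetical stand-in for the basic unit dictionary file 2026.
BASE_UNITS = {"k", "m", "s", "g"}
# Symbols the description says are regarded as part of a unit.
CONNECTORS = {"/", "○"}

def recognize_unit(key: str):
    """Return the list of basic units making up `key`, or None if the
    string cannot be decomposed (treated as an unregistered word)."""
    p = 0          # pointer P, set to the first character
    parts = []
    while p < len(key):            # does a string remain at P?
        if key[p] in CONNECTORS:   # skip "/" and similar symbols
            p += 1
            continue
        # Look for a basic unit starting exactly at P.
        # Assumption: prefer the longest matching entry.
        candidates = [u for u in BASE_UNITS if key.startswith(u, p)]
        if not candidates:
            return None            # no entry: not a unit
        unit = max(candidates, key=len)
        parts.append(unit)
        p += len(unit)             # advance P by the unit's length
    return parts or None

print(recognize_unit("km/s"))
print(recognize_unit("kx"))
```

With the toy table above, km/s decomposes into k, m and s, whereas kx fails at x and is reported as unregistered. Because only the basic units are stored, compound forms never need their own dictionary entries.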

When unit recognition (2112) has finished, the process returns to step 2102 in the flow chart of Fig. 15, and the processing section 2016 again cuts out a dictionary query unit. After cutting out the query unit, the section 2016 judges whether or not a cut-out unit still exists (2104). If no cut-out unit, i.e. no query key string, remains, the section 2016 outputs the information stored in the table 2016a to the output unit via the output interface 2018 (2114). The analysis (the so-called parsing) of the input sentence is thus completed.

As described above for this embodiment, the English input sentence is divided into query key strings, which are first looked up in an ordinary dictionary file 2022; if the file 2022 contains no entry, unit recognition is performed. In unit recognition, the query key string is divided at the positions indicated by the pointer P, and the basic unit dictionary file 2026 is queried for each of the resulting substrings. A string that is recorded in the file 2026, or that is composed of a sequence of strings recorded in the file 2026, is judged to be a character string expressing a unit.
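The overall flow of Fig. 15 (ordinary dictionary first, unit recognition only as a fallback) can be summarised in a short sketch. The names `analyze`, `toy_dictionary` and `toy_units`, and the shape of the stored records, are hypothetical; the unit recognizer is passed in as a function so that any implementation can be used.

```python
def analyze(keys, dictionary, recognize_unit):
    """Look up each query key string and collect the results in a table
    that plays the role of the holding table 2016a."""
    table = []
    for key in keys:                       # cut out the next query unit
        entry = dictionary.get(key)        # query the ordinary dictionary
        if entry is not None:
            table.append((key, entry))     # store grammar information
            continue
        parts = recognize_unit(key)        # fallback: unit recognition
        if parts is not None:
            table.append((key, {"pos": "unit", "parts": parts}))
        else:
            table.append((key, {"pos": "unregistered"}))
    return table                           # handed to the output interface

# Toy data, purely illustrative.
toy_dictionary = {"light": {"pos": "noun"}, "travels": {"pos": "verb"}}
toy_units = {"km/s": ["k", "m", "s"]}      # stand-in recognizer results

result = analyze(["light", "travels", "km/s", "zzz"],
                 toy_dictionary, toy_units.get)
for row in result:
    print(row)
```

Here `toy_units.get` merely imitates a recognizer by returning a stored decomposition or `None`; in the device described, that role is played by the unit recognition section 2024.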

Since unit recognition can thus be performed even for a character string expressing a compound unit, by combining basic units stored in the file 2026, the parsing can cope with a wide variety of unit expressions. Moreover, since the file 2026 needs to store only basic units such as k, m, s, etc., and compound units composed of them, such as km or km/s, need not be stored, the capacity of the dictionary file can be reduced.

The overall structure of a third embodiment of the invention, in which the speech analyzer according to the invention is applied to an automatic English-Japanese translation device, is described with reference to Fig. 18.

This embodiment has an input section 3010 through which an English text 3012 to be translated into Japanese is entered. The input section 3010 may comprise, for example, a keyboard with character keys such as alphanumeric and function keys, an optical character reader (OCR) for reading English text recorded on paper, and/or a file storage device for reading English text recorded on a storage medium such as a magnetic disk. The English text entered through the section 3010 is read into the pre-editing section 3014, in which preprocessing for the translation is carried out. This consists mainly of sentence recognition and the processing of unknown words, and it functions as part of the morphological analysis. After pre-editing, the English data are transferred, together with the information obtained during pre-editing, to a morphological analysis section 3016. The section 3016 analyzes the morphemes of the English sentence while dividing it by consulting a word dictionary 3018, carries out various kinds of processing, for example for unknown words, nouns, time expressions and numbers, and performs processing on the sentence as a whole, such as the recognition of a tag question and the recognition of an apposition. The morphological analysis rules are stored in a rule file 3036.

After the morphological analysis, the English data are stored, together with the dictionary information obtained in the morphological analysis, in a parsing section I 3020. The section I 3020 is a functional section that analyzes the surface structure of the sentence by applying grammar rules to the English data and finds all structural possibilities.

The English data that have been analyzed in the section I 3020 are transferred, together with the information obtained in that analysis, to a parsing section II 3022. In this section, one solution is selected by applying a structural description based on the result of the surface-structure analysis (parsing) performed by syntactic analysis I, and a plausible parse tree for the English sentence is prepared to form its structure. These parsing rules are stored in the rule file 3036.

The English data that have undergone this analysis are transferred as parse tree data to a structure conversion section 3024. In the section 3024, a structure tree of the corresponding Japanese sentence is prepared from the structure tree that represents the intermediate structure of the English sentence, so that the sentence is converted into an underlying Japanese structure from which a Japanese sentence can readily be produced. The structure tree data representing the underlying Japanese structure obtained in this way are supplied to a translation generation section 3026, in which the translated sentence is formed. This is the functional section that builds a Japanese sentence from the Japanese sentence structure tree.

The Japanese sentence data prepared as a translated sentence, i.e. the translated data, are passed to a pre-editing section 3030. In section 3030, the translated data are modified, with the aid of information that was retrieved from the dictionary 3018 and used in the translation, so as to complete a more natural Japanese sentence. The data for the Japanese sentence are transferred to an output section 3032 and output from it as a translated Japanese sentence 3034. The output section 3032 includes, for example, a printer, a display and/or a file storage device such as a magnetic disk.

The flow of this series of translation operations is controlled by a control section 3038, which controls the entire apparatus. In this embodiment, the word dictionary 3018 stores dictionary data for English and Japanese words in which, in addition to the vocabulary, various information is described, such as connective relations (i.e. co-occurrence relations), meanings, plural or singular forms, parts of speech, etc. Furthermore, the file 3036 stores rule data for the morphological and the syntactic analysis.

The control section 3038 is connected to an operation display section 3040. The section 3040 has, for example, operating keys such as a translation display key or a cursor key, with which the operator gives various information to the apparatus, and a display device that visually presents the input English text, the Japanese text resulting from a translation, intermediate data such as dictionary information, and various indications for the operator. Many of these operation and display functions may be arranged so that they are included in a keyboard, if one is provided in the input section 310, or in a display, if one is provided in the output section 3032.

Fig. 17 shows an example of a detailed structure for the processing of numbers in the morphological analysis section 3016. Section 3016 naturally also contains other functional sections, but only those parts that relate directly to the understanding of the invention are shown here. The morphological analysis is carried out by instructing dictionary lookups successively from the head of the input character string in accordance with the retrieval key string, and by processing the dictionary information obtained from the dictionary retrieval section 3104 in accordance with a numeric flag, which is described later.

Section 3016 has an input processing section 3100 for receiving and processing the data of the character string input from the preprocessing section 3014. Section 3100 is provided with an input string buffer, which receives the English character string data in the form of code data, for example ASCII data, and stores them temporarily.

The input character string data temporarily stored in section 3100 are passed to a unit division section 3102, which divides them into dictionary lookup units such as words. Section 3102 is a functional section for delimiting the dictionary reference units, each of which serves as a retrieval key string when the dictionary 3018 is searched in section 3104. The dictionary reference delimiters used for this division are placed at a specific position of an English letter, a numeric character, an apostrophe, a character other than a hyphen and a paragraph, and at an apostrophe that follows a space. They are stored in a delimiter table 3108, which section 3102 consults when dividing the input into dictionary reference units. The word dictionary 3018 contains, in particular, the information to be retrieved for the divided units. As the example of entry information in Fig. 24 shows, for each entry of a dictionary reference unit, e.g. a word, it stores grammatical information such as the part of speech, as well as an indicator showing that the word represents a number, i.e. a numeric flag, and numeric value information giving the numeric value of a word that denotes a number.

As shown in Fig. 17, both singular and plural forms are described together with each entry in the word dictionary 3018, and each of them constitutes an entry. The numeric flag indicates that a word denotes a number when it is set to "1". Further information registered includes, for example, countability or uncountability for a noun, an identification of transitive or intransitive verbs, translated words, etc. For example, since "thousand" is a noun that represents a number, its numeric flag is "1" and its numeric value is "1000". Since "thread" is a noun, but not one that denotes a number, its numeric flag is registered as "0".
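The entry layout described above can be pictured with a small sketch. The class name `Entry`, its field names and the two sample entries are assumptions made for illustration only; the description states merely that an entry carries grammatical information, a numeric flag and a numeric value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical shape of one word-dictionary entry."""
    word: str
    part_of_speech: str
    numeric_flag: int                  # 1 = the word denotes a number, 0 = it does not
    numeric_value: Optional[int] = None


# Two sample entries taken from the examples in the text.
WORD_DICTIONARY = {
    "thousand": Entry("thousand", "noun", 1, 1000),
    "thread": Entry("thread", "noun", 0),
}


def lookup(word: str) -> Optional[Entry]:
    """Return the entry for a word, or None if the word is not registered."""
    return WORD_DICTIONARY.get(word.lower())


def is_number_word(word: str) -> bool:
    """A registered word counts as a number only if its numeric flag is 1."""
    entry = lookup(word)
    return entry is not None and entry.numeric_flag == 1
```

With these sample entries, `is_number_word("thousand")` is true and `is_number_word("thread")` is false, matching the two cases in the text.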

A number is therefore recognized by means of the numeric flag when the word is registered in the dictionary 3018, as "one" or "thousand" are, for example. Even unregistered words are recognized as numbers: a sequence of digits such as "123"; two digit sequences with a point between them, i.e. a decimal number such as "10.2"; and a sequence of numeric characters separated by commas such as "1,000,000". In the present description, the term "numeric character" generally includes not only Arabic numerals but also a spelled-out numeric expression such as "thirteen".
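As a rough illustration of the three unregistered forms listed above, the following sketch classifies a token with regular expressions. The function name and the patterns are assumptions; in particular, the text does not say that comma-separated groups must contain exactly three digits, and the apparatus itself walks the string character by character rather than using patterns.

```python
import re

# Hypothetical patterns for the three forms named in the text.
PLAIN = re.compile(r"\d+")                    # e.g. "123"
DECIMAL = re.compile(r"\d+\.\d+")             # e.g. "10.2"
GROUPED = re.compile(r"\d{1,3}(?:,\d{3})+")   # e.g. "1,000,000" (three-digit groups assumed)


def looks_like_number(token: str) -> bool:
    """Return True if the whole token matches one of the three numeric forms."""
    return any(p.fullmatch(token) for p in (PLAIN, DECIMAL, GROUPED))
```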

As shown in Fig. 28, the dictionary 3018 has a currency symbol table 3018a, in which various currency symbols are registered, a notation symbol table 3018b, in which the notation symbols ",", "." and " " (space) are registered, and a decimal point table 3018c, in which the decimal points "." and ",", etc. are registered. The tables for the notation symbols and the decimal points are provided because "," is used as the notation symbol and "." as the decimal point in Japanese or English, whereas a space or a period is mainly used as the notation symbol and "," as the decimal point in other European languages such as German or French; the use of these symbols thus differs between the languages to be processed.
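The language dependence described above can be sketched as a pair of per-language tables. The dictionary `SYMBOL_TABLES`, its language keys and the function `normalize` are hypothetical names; the text only states which symbols play which role in which languages.

```python
# Hypothetical per-language stand-ins for the notation symbol table and
# the decimal point table.
SYMBOL_TABLES = {
    "en": {"notation": {","}, "decimal": {"."}},
    "ja": {"notation": {","}, "decimal": {"."}},
    "de": {"notation": {".", " "}, "decimal": {","}},
    "fr": {"notation": {".", " "}, "decimal": {","}},
}


def normalize(text: str, language: str) -> str:
    """Drop notation symbols and rewrite the decimal point as '.'."""
    tables = SYMBOL_TABLES[language]
    out = []
    for ch in text:
        if ch in tables["notation"]:
            continue                      # grouping symbol: carries no value
        out.append("." if ch in tables["decimal"] else ch)
    return "".join(out)
```

Under these assumptions the English "1,000.5" and the German "1.000,5" both normalize to the same string, which is the reason separate tables are needed.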

The dictionary retrieval section 3104 is a functional section that obtains the dictionary information by searching the word dictionary 3018 on the basis of the retrieval key string supplied by the unit division section 3102, and transfers it to the processing sections 3110, 3112 and 3116.

Consecutive numeric characters are combined by the following two processes. First, when a word has been recognized as a number as described above, the next dictionary reference unit is examined, and if it is also recognized as a number, the two are joined into a single number. The operation is repeated as long as numbers follow. For example, "30 thousand" is converted into "30000", and "1.5 million" is converted into "1500000". Second, when the numeric expression continues with "and" in between, the expressions are combined into one number if, in the value to the left of "and", all the digits at the positions corresponding to the digits of the value to the right of "and" are "0". For example, "one hundred and thirty" is combined into "130", and "30 thousand and two hundred" is combined into "30200".
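The two joining rules can be illustrated with a short sketch that reproduces the four examples given. The function names are hypothetical, and the arithmetic of the first rule is an assumption: the text does not state how adjacent numbers are merged, so the sketch simply multiplies them, which matches "30 thousand" and "1.5 million" but does not cover expressions such as "one thousand two hundred".

```python
def join_consecutive(values):
    """Fold adjacent numeric values by multiplication (assumed rule)."""
    total = values[0]
    for value in values[1:]:
        total *= value
    # Return an int when the product has no fractional part.
    return int(total) if total == int(total) else total


def join_with_and(left: int, right: int):
    """Merge 'left and right' only if left is all zeros where right has digits."""
    width = len(str(right))              # number of digit positions used by right
    if left % 10 ** width == 0:
        return left + right
    return None                          # rule does not apply; keep the values apart
```

For instance, `join_with_and(100, 30)` gives 130, while `join_with_and(130, 20)` returns `None` because the tens position of 130 is already occupied.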

After recognition as a number, the necessary local analysis (local parsing) is carried out. In this processing, a sequence of parsing units, each characterized by its morpheme information, is combined into a single parsing unit on the basis of a local analysis rule. For example, a currency symbol and a numeric value such as "¥1,000" are combined as "one thousand yen", and a numeric value and a unit such as "1.5 km" are likewise combined.

These combinations are carried out in the processing sections 3110 to 3122. The processing section 3110 is a functional section for combining a number with a currency symbol or a unit. The processing section 3112 is a functional section for numbering, i.e. for converting a number word into its numeric value. The processing section 3114 is a functional section for processing numbers joined by a hyphen. The processing section 3116 is a functional section for processing consecutive numeric characters.

For a number that is to be combined with a currency symbol or a unit, the currency symbol and the numeric value are combined into a single noun in the processing section 3118, and the unit and the numeric value are combined into a proper noun in the processing section 3120. For a number that has undergone the numbering, for a hyphenated number and for consecutive numeric characters, the combination with the preceding numeric value is carried out in the processing section 3122. The dictionary information for the input character string completed by this processing is stored in the ordered dictionary information buffer, i.e. the dictionary information preservation table 3124. The results of the morphological analysis are transferred from table 3124 to the parsing section I 3020.

The processing by means of the numeric flag proceeds in sequence, as shown in Figs. 19A and 19B. The data for the input character string are received at the input processing section 3100, where input processing is carried out (3200). The unit division section 3102 then divides the input character string into dictionary reference units for searching the dictionary 3018 (3201). The dictionary retrieval section 3104 searches the dictionary 3018 accordingly (3203) and, if there is a dictionary entry (3204), checks the numeric flag (3205). If the numeric flag is not set, i.e. the word is not a number, the dictionary information is stored in table 3124. If the numeric flag is set to "1", the number is converted into a numeric value in the processing section 3112 (3206), and the combination with the preceding numeric value (3207) is carried out in section 3122. When these processes have been carried out up to the end of the sentence indicated by the input character string data (3202), the combination with the currency symbol or the unit (3209) is carried out in sections 3118 and 3120, and the result of the morphological analysis is then output (3210) to the syntactic analysis section I 3020.

If the dictionary lookup finds no entry at step 3204 and the unit is joined by a hyphen, the processing for a hyphenated number (3213) is carried out in section 3114. If the unit is not hyphenated and its first character is a currency symbol (3214), the currency symbol alone is stored in table 3124 (3216) and is removed from the dictionary reference unit (3217). If the first character is not a currency symbol (3214), the processing for consecutive numeric characters (3215) is carried out in the processing section 3116. The operation is continued up to the end position (3202).
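The branching of Figs. 19A and 19B, as far as it is described in the text, can be sketched as a simple dispatcher. Everything named here (`SAMPLE_DICTIONARY`, `CURRENCY_SYMBOLS`, `analyze`, the tuple labels) is a hypothetical stand-in; the sketch only records which branch each unit would take and leaves out the actual hyphen processing, digit processing and the combination steps 3207 and 3209.

```python
# Hypothetical stand-ins for the word dictionary and the currency symbol table.
SAMPLE_DICTIONARY = {
    "thousand": {"numeric_flag": 1, "value": 1000},
    "thread": {"numeric_flag": 0},
}
CURRENCY_SYMBOLS = {"$"}


def analyze(units, dictionary=SAMPLE_DICTIONARY, currency_symbols=CURRENCY_SYMBOLS):
    """Route each dictionary reference unit to the branch named in the text."""
    table = []                                    # stands in for table 3124
    for unit in units:                            # repeat until the end (3202)
        entry = dictionary.get(unit.lower())      # dictionary lookup (3203)
        if entry is not None:                     # entry found (3204)
            if entry["numeric_flag"] == 1:        # flag check (3205)
                table.append(("number", entry["value"]))   # numbering (3206)
            else:
                table.append(("word", unit))
        elif "-" in unit:
            table.append(("hyphenated", unit))    # hyphenated number (3213)
        else:
            if unit[:1] in currency_symbols:      # leading currency symbol (3214)
                table.append(("currency", unit[0]))        # store symbol (3216)
                unit = unit[1:]                   # strip it from the unit (3217)
            if unit:
                table.append(("digits", unit))    # consecutive digits (3215)
    return table
```

Calling `analyze(["thread", "thousand", "twenty-two", "$100"])` routes the four units to the word, number, hyphenated and currency-plus-digits branches respectively.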

The combination (3209) with the currency symbol and the unit is carried out in the processing section 3110 by the processing flow shown in Fig. 20. In the initial processing (3220), the processing pointer is first set to the head of the buffer. If the element indicated by the pointer is not a numeric value (3221), the pointer is advanced by one step (3226). If the element is a numeric value but has neither a preceding currency symbol nor an adjoining unit, the pointer is likewise advanced (3222, 3224). The processing is continued up to the end of the dictionary reference unit (3227).

If the element is a numeric value preceded by a currency symbol (3222), the currency symbol and the numeric value are combined into a single noun (3223). For example, a currency symbol and the numeric characters "1,000" are combined into one noun. If the preceding element is not a currency symbol and the following element is a unit, the numeric value and the unit are combined into a single noun (3225). For example, the numeric characters and the unit in "1.5 km" are combined into a single noun. The processing is continued up to the end of the dictionary retrieval unit (3227).
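The pointer walk of Fig. 20 can be sketched over a list of already classified elements. The function name `combine`, the `(kind, text)` tuples and the kind labels are assumptions for illustration; the text does not specify how the buffer elements are represented.

```python
def combine(items):
    """Merge currency+number and number+unit pairs into single nouns.

    items is a list of (kind, text) pairs, where kind is one of the
    hypothetical labels 'currency', 'number', 'unit' or 'word'.
    """
    out = []
    i = 0                                          # processing pointer (3220)
    while i < len(items):                          # until the end (3227)
        kind, text = items[i]
        if kind == "number":                       # numeric value? (3221)
            if out and out[-1][0] == "currency":   # preceded by a symbol (3222)
                symbol = out.pop()[1]
                out.append(("noun", symbol + text))          # combine (3223)
                i += 1
                continue
            if i + 1 < len(items) and items[i + 1][0] == "unit":   # unit follows (3224)
                out.append(("noun", text + " " + items[i + 1][1]))  # combine (3225)
                i += 2
                continue
        out.append((kind, text))                   # nothing to combine
        i += 1                                     # advance the pointer (3226)
    return out
```

A number with neither a preceding currency symbol nor a following unit passes through unchanged, as in the third branch of the flow.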

The hyphenated number is processed in the processing section 3114 according to the flowchart shown in Figs. 21A and 21B. In the initial processing (3230), the hyphenated dictionary reference unit is first stored in the buffer, the stored numeric value is set to "0", and the hyphens in the original dictionary reference unit are changed to spaces. The dictionary retrieval unit is then divided (3231) and looked up in the dictionary (3235). If the lookup finds no entry, i.e. the word is not registered in the dictionary (3236), the entire hyphenated dictionary reference unit is stored in table 3124 as a word not registered in the dictionary (3237).

If the dictionary lookup finds an entry (3236), it is checked whether its numeric flag is "1". If the numeric flag is not "1", the word is not a number, and the entire hyphenated reference unit is stored in table 3124 as a word not registered in the dictionary (3237).

If the numeric flag of the dictionary entry is set to "1", section 3112 converts the number into a numeric value on the basis of the entry data (3239). The resulting value is then added to the numeric value stored at that point (3240), and the sum is stored (3241). Thus, for example, the "two" in "twenty-two" is added to the "20" of the immediately preceding "twenty" to give "22". The processing is continued up to the end of the dictionary retrieval unit (3232).
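The accumulation loop for a hyphenated number can be sketched as follows. The table `NUMBER_WORDS` and the function name are hypothetical and stand in for dictionary entries whose numeric flag is "1"; returning `None` stands in for storing the whole unit as an unregistered word.

```python
# Hypothetical stand-in for dictionary entries with numeric flag 1.
NUMBER_WORDS = {"twenty": 20, "thirty": 30, "two": 2, "five": 5}


def hyphenated_value(unit: str):
    """Sum the values of the hyphen-joined parts, or return None."""
    saved = 0                                      # stored value starts at 0 (3230)
    for part in unit.replace("-", " ").split():    # hyphens to spaces, divide (3231)
        value = NUMBER_WORDS.get(part.lower())     # dictionary lookup (3235)
        if value is None:                          # no entry, or flag not 1 (3236)
            return None                            # treat as unregistered (3237)
        saved += value                             # add and keep the sum (3240, 3241)
    return saved                                   # value of the whole unit
```

A hyphenated word that contains any non-numeric part, such as "well-known", falls through to the unregistered case as a whole.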

When the end position is reached, the flow branches at step 3232 to processing 3233, and the stored numeric value is taken as the numeric value of the entire hyphenated dictionary reference unit. The combination 3207 of this value with the preceding numeric value is then carried out.

The processing 3215 of consecutive numeric characters, which is carried out in the processing section 3116, is now explained with reference to Figs. 22A and 22B. In these flowcharts, the symbol "" denotes substitution. First, an initialization 3250 is carried out, in which the stored numeric value val-save is set to "0", the parameter "i" is set to "1", and the pointer p is set to the head of the character string of the dictionary reference unit.

It is then checked whether the character *p indicated by the pointer p is a numeric character (3251), a notation symbol (3252) or a decimal point (3253). If it is none of these, the entire character string is stored in table 3124 as a word not registered in the dictionary (3255). If it is a decimal point (3253), the parameter i is multiplied by 10 (3254) and step 3258 is carried out. At step 3258, the numeric value num(*p) of the character *p is added to the stored value val-save to form a new stored value. The numeric value num(*p) is the value obtained by regarding the character *p as a number.

If the character is a numeric or notation character at step 3251 or 3252, step 3257 is performed. At step 3257, the stored numerical value val-save is multiplied by 10 and the numerical value num(*p) for the character *p is added to it to form a new stored numerical value.

After this processing, the pointer is advanced by one (3259) and the processing is repeated up to the end position of the dictionary reference unit (3260). When the end position of the character string is reached, the stored numerical value is taken as the numerical value of the entire character string (3261), and a collective composition 3207 with the preceding numerical value is performed in the section 3122. Through this processing, a run of numeric characters such as "1,000.5" is analyzed as the numerical value "1000.5".
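The character-by-character conversion described above can be sketched as follows. This is only an illustrative reading of the flowchart, not the patented implementation: the function name `parse_numeric_string` is hypothetical, the "notation character" is assumed to be the digit-grouping comma and is skipped, and digits after the decimal point are assumed to be divided by the scale factor i before being added to val-save.

```python
DIGITS = "0123456789"


def parse_numeric_string(text, notation_chars=","):
    """Convert a string such as "1,000.5" to a number.

    Returns None when the string contains any other character, which
    corresponds to storing it as a word not registered in the dictionary.
    """
    val_save = 0.0      # running value ("val-save" in the text)
    scale = 1           # parameter "i"; assumed to grow by 10 per fractional digit
    seen_point = False  # True once the decimal point has been read

    for ch in text:
        if ch in DIGITS:
            if not seen_point:
                # Integer part: shift left by one decimal place, then add.
                val_save = val_save * 10 + int(ch)
            else:
                # Fractional part (assumed interpretation of steps 3254/3258).
                scale *= 10
                val_save += int(ch) / scale
        elif ch in notation_chars:
            # Assumption: grouping commas carry no value and are skipped.
            continue
        elif ch == "." and not seen_point:
            seen_point = True
        else:
            return None
    return val_save
```

Under these assumptions, `parse_numeric_string("1,000.5")` yields 1000.5, matching the example in the text, while a string containing a letter yields `None`.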

The collective composition 3207 with the preceding numerical value is performed in the section 3122 as follows. First, the pointer p into the dictionary table is set to the position preceding the dictionary reference unit (3270). If nothing is present at that position, the numerical value is the first entry in the preservation table, and the numerical value for the current dictionary reference unit is recorded in the table 3124 (3284). It is recorded at the position next to the one indicated by the pointer p.

If a word is present at the preceding position at step 3271, the entry indicated by the pointer p is not "and" (3272) and the pointer p does not indicate a numerical value (3273), the numerical value for the current dictionary reference unit is recorded in the table 3124 at the position next to the one currently indicated by the pointer p (3284). In the example "To him two ...", "two" is newly recorded as the numerical value "2".

If the pointer p indicates a numerical value at step 3273, the numerical value p → v of the entry indicated by the pointer p is multiplied by the numerical value v-now of the current dictionary reference unit, and the product becomes the new numerical value p → v of that entry (3274). In the case of "two thousand", for example, "2 × 1000 = 2000" is computed so that the whole of "two thousand" is composed into a single expression. The end position of the current dictionary reference unit is then set as the end position p-end of the entry indicated by the pointer p (3282).

If the entry indicated by the pointer p is "and" at step 3272, the pointer p is moved to the dictionary reference unit before it (3275). If this is not the end position (3276) and the entry is a numerical value (3277), the numerical value v-now of the current dictionary reference unit is rounded up at its most significant digit, and the result is set as the value vl (3278). If the numerical value v-now of the current reference unit is, for example, "8", "8.1", "98" or "11", the value vl is "10", "10", "100" or "100", respectively.
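The rounding that produces vl can be read as "the smallest power of ten greater than v-now". The sketch below reproduces the four examples given in the text. The function name is hypothetical, and the handling of an exact power of ten (for example 10, which this sketch maps to 100) is an assumption, since the text gives no such case.

```python
def round_up_to_power_of_ten(v_now):
    """Return the smallest power of ten strictly greater than v_now."""
    vl = 1
    while vl <= v_now:
        vl *= 10
    return vl


# The examples from the text: 8, 8.1, 98, 11 -> 10, 10, 100, 100
examples = [round_up_to_power_of_ten(v) for v in (8, 8.1, 98, 11)]
```

A remainder of zero when the preceding value is divided by vl then means that the preceding value has only zeros in the decimal places the current value would occupy, which is why 2000 and 22 can be added to give 2022.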

It is then checked whether the remainder obtained by dividing the numerical value p → v of the entry indicated by the pointer p by vl, i.e. mod(p → v, vl), is "0" or not. If it is not "0", the pointer p is incremented (3283) and the numerical value for the current reference unit is recorded in the table 3124 at the position next to the one indicated by the current pointer p (3284). In the case of "I and two", for example, "two" is newly recorded as the numerical value "2".

If the remainder at step 3279 is "0", the numerical value v-now of the current reference unit is added to the numerical value p → v of the entry indicated by the pointer p, and the sum becomes the new numerical value p → v of that entry (3280). In the case of "two thousand and two", for example, "two thousand" has already been composed into "2000" at this stage. The "2" of "two" is then added to it by the addition 3280, so that the whole is collectively composed into "2002". The entry for "and", which is indicated by the pointer p + 1, is then deleted from the table 3124 (3281), and the process proceeds to step 3282.

The lookup will now be explained using an example. When a dictionary lookup is performed for the input character string "To him two thousand and twenty-two ..." shown in Fig. 25, the dictionary entry information is written into the table 3124 as shown in Fig. 26A. For "him", for example, the start position is "4", the end position is "6" and the part of speech is pronoun. In the numerical processing, it is first judged for "two" that the numerical flag is "1" (3205) and that its numerical value is "2". Since the entry preceding "2" in this character string is not a numerical value, "2" is stored directly in the table 3124 (3206, 3208, 3284).

The pointer is then incremented to process "thousand". The numerical flag is "1" and the numerical value is "1000" (3205, 3206). Since the numerical value of the preceding entry is "2" (3207, 3273), the multiplication 2 × 1000 is performed (3274) and the result is stored in the table 3124 (see Fig. 26B). For the following "and", the dictionary information is temporarily stored as it is in the table 3124 (see Fig. 26C).

The pointer is then advanced to process "twenty-two". Since this is a hyphenated word that is not found in the dictionary (3212), "20 + 2 = 22" is computed in the processing 3213 for hyphenated numbers (3237, 3239 to 3241). Since the preceding word is "and" (3272) and the numerical value before it is "2000" (3277), the numerical value "22" is rounded up at its most significant digit to "100" (3278) and the division operation 3279 is performed. Since the remainder is "0", the addition 3280 of "2000" and "22" is performed. The entry for "and" is removed from the table 3124 (3281), and the result of the addition, 2022, is kept in the table 3124 as a numerical value, whereby "two thousand and twenty-two" is recognized as "2022". The collective composition with the preceding numerical value (3207) is thus complete.

A further example will now be given. As shown in Fig. 27, the input character string "You said $1,000.5 thousand was ..." is parsed. "$1,000.5" is not registered in the dictionary 3018. The first character is the currency symbol "$", which can be recognized as a currency symbol from its dictionary entry. It is recorded on its own in the table 3124 (3214, 3216, Fig. 29A).

Then "1,000.5" is converted into the numerical value "1000.5" by the subsequent numerical character processing 3215. Since the preceding entry is the symbol "$" and not a numerical value, the numerical value is recorded as it is (3270 to 3273, 3284, Fig. 29B).

The next word, "thousand", is a number whose numerical value is "1000". Since the preceding entry is a numerical value (3272, 3273), the calculation "1000.5 × 1000 = 1000500" is performed (3274, Fig. 29C). After the dictionary lookup has finished, the contents kept in the table 3124 are checked one after another. Since the currency symbol "$" stands immediately before the numerical value "1000500", the two are collectively composed, and "$1000500" is formed as a single word entry (3209, 3221 to 3223, Fig. 29D).
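The two worked examples can be traced with a small sketch of the collective composition. It is a simplified reading under stated assumptions, not the patented implementation: entries are reduced to hypothetical `(kind, value)` pairs with the kinds "word", "num", "cur" and "money" invented for illustration, start and end positions are omitted, and the list `table` merely stands in for the table 3124.

```python
def round_up_to_power_of_ten(v_now):
    """Smallest power of ten strictly greater than v_now (the value vl)."""
    vl = 1
    while vl <= v_now:
        vl *= 10
    return vl


def compose(entries):
    """Compose number words and currency symbols in a list of (kind, value) pairs."""
    table = []  # stands in for table 3124
    for kind, value in entries:
        if kind != "num":
            table.append((kind, value))
            continue
        prev = table[-1] if table else None
        if prev is not None and prev[0] == "num":
            # Preceding entry is a number: multiply (step 3274).
            table[-1] = ("num", prev[1] * value)
        elif (prev == ("word", "and") and len(table) >= 2
              and table[-2][0] == "num"
              and table[-2][1] % round_up_to_power_of_ten(value) == 0):
            # "<number> and <number>" with remainder 0: delete "and"
            # (step 3281) and add (step 3280).
            table.pop()
            table[-1] = ("num", table[-1][1] + value)
        else:
            # Otherwise record the number as a new entry (step 3284).
            table.append(("num", value))

    # After the lookup: join a currency symbol with the number that follows it.
    merged = []
    for kind, value in table:
        if kind == "num" and merged and merged[-1][0] == "cur":
            shown = int(value) if value == int(value) else value
            merged[-1] = ("money", f"{merged[-1][1]}{shown}")
        else:
            merged.append((kind, value))
    return merged


first = compose([("word", "To"), ("word", "him"), ("num", 2),
                 ("num", 1000), ("word", "and"), ("num", 22)])
second = compose([("word", "You"), ("word", "said"), ("cur", "$"),
                  ("num", 1000.5), ("num", 1000), ("word", "was")])
```

Here `first` ends with the single entry `("num", 2022)` and `second` contains `("money", "$1000500")`, mirroring Figs. 26 and 29 as described in the text. In "I and two" the entry before "and" is not a number, so "and" is kept and "2" is recorded separately.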

The fourth embodiment of the invention will now be explained. Fig. 30 shows the fourth embodiment of the speech analyzer according to the invention, applied to an automatic English-to-Japanese translation device.

This embodiment has an input processing section 4014, to which data are supplied from an input device 4012. The input device 4012 comprises, for example, a keyboard with character keys such as alphanumeric and function keys, an optical character reader for reading English text recorded on paper, and a reading device for a medium such as a magnetic disk.

The input processing section 4014 has a buffer 4014a for the input character string and stores the English sentence entered from the device 4012 in the buffer 4014a. The section 4014 reads an input sentence stored in the buffer 4014a and passes it to the unit division section 4016. The section 4016 is a functional section that divides the input sentence from the section 4014 into dictionary reference units. A delimiter table 4018 stores delimiters such as the space and the comma. The section 4016 takes the delimiters from the delimiter table 4018 and divides the input sentence from the section 4014, at the positions where delimiters occur, into character strings that serve as the units for looking up the reference dictionary 4020. The divided character strings are input to the dictionary retrieval section 4022.
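The division into dictionary reference units can be illustrated by a short sketch. The names `DELIMITERS` and `split_into_lookup_units` are hypothetical. The set contains only the two delimiters named in the text, and dropping the delimiters from the output is an assumption.

```python
DELIMITERS = {" ", ","}  # stands in for the delimiter table 4018


def split_into_lookup_units(sentence, delimiters=DELIMITERS):
    """Split a sentence into lookup strings at every delimiter character."""
    units, current = [], []
    for ch in sentence:
        if ch in delimiters:
            # A delimiter closes the unit collected so far, if any.
            if current:
                units.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        units.append("".join(current))
    return units


units = split_into_lookup_units("Mr. Smith, who lives in Tokyo")
```

Keeping the delimiters in a table rather than hard-coding them means further delimiters can be added without changing the splitting routine.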

The section 4022 looks up the reference dictionary 4020 for the input sentence that has been divided into reference units by the section 4016. The reference dictionary 4020 stores, for example, entries for English character strings together with their parts of speech, feature information and so on, as shown in Fig. 31. In addition to the proper names shown in the figure, the dictionary 4020 also stores character strings for other parts of speech, such as verbs and adjectives. The part of speech "proper name" in the figure indicates that the entry is subject to the processing for registered proper names described later; it does not denote a proper noun in the ordinary grammatical sense. The feature information indicates what the proper name stands for. It is not always limited to a single feature since, as described later, a proper name can represent several features depending on how it is used.

The section 4022 looks up the dictionary 4020 for each character string divided into reference units. If the character string is a proper name, it is passed to the proper-name processing section 4024, where the proper-name processing described later is performed. If it is not a proper name, it is passed to the processing section 4036 and kept in its table 4036a.

The proper-name processing section 4024 comprises a processing section 4026 for the preceding sentence end, a processing section 4028 for the preceding proper name, a processing section 4030 for the proper name itself, a processing section 4032 for the preceding proper name and the proper name itself, and a section 4034 for default feature information. The section 4026 judges whether or not the character string preceding the one looked up by, and input from, the section 4022 is at the end of a sentence. If the preceding character string is at the end of a sentence, the section 4026 converts the capital letter at the beginning of the character string being processed into a lower-case letter, passes the string to the dictionary retrieval section 4022 and causes the section 4022 to look up the reference dictionary 4020 again. A character string that is not found even by this second lookup is judged to be an unregistered proper name, sent to the processing section 4036 and stored in the table 4036a. If the preceding character string is not at the end of a sentence, the character string is passed to the processing section 4036 as a proper name whose feature information is unknown and is registered in the table 4036a, as described later.
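The decision made for a capitalized string that was not found in the dictionary can be sketched as follows. The function name, the result labels and the use of a plain `dict` in place of the reference dictionary 4020 are all assumptions made for illustration, and the word is assumed to be non-empty.

```python
def classify_unfound_capitalized(word, prev_is_sentence_end, dictionary):
    """Classify a capitalized word that the first dictionary lookup missed.

    Returns a (label, lookup_string, entry) triple.
    """
    if prev_is_sentence_end:
        # The capital may only mark the start of a sentence:
        # lower-case the first letter and look the word up again.
        lowered = word[0].lower() + word[1:]
        entry = dictionary.get(lowered)
        if entry is not None:
            return ("registered", lowered, entry)
        # Not found by the second lookup either.
        return ("unregistered_proper_name", word, None)
    # Capitalized in mid-sentence: a proper name with unknown features.
    return ("proper_name_unknown_features", word, None)


toy_dictionary = {"the": {"pos": "article"}}  # hypothetical content
at_start = classify_unfound_capitalized("The", True, toy_dictionary)
mid_sentence = classify_unfound_capitalized("Ricoh", False, toy_dictionary)
```

The retry only runs at the start of a sentence because a capital letter in mid-sentence is already evidence of a proper name.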

The section 4028 analyzes the feature information for the preceding character string received from the section 4026 and passes the result to the processing section 4030 for the proper name itself. The section 4030 checks the feature information of the proper name to be analyzed and, as described later, if feature information is registered for only one of the proper name and the preceding proper name, analyzes the two collectively using the information registered for that one, and keeps the result in the table 4036a of the section 4036.

The processing section 4032 for the preceding proper name and the proper name itself determines the part that is common to the feature information of the proper name and that of the preceding proper name, analyzes the two proper names with respect to this common part, passes the result to the processing section 4036 and stores it in its table 4036a. The section 4034 assigns feature information to a proper name that has been read out of the table 4036a, sent via the section 4016 to the section 4022 and found, as a result of looking up the dictionary 4020 in the section 4022, to have no feature information. Since a proper name can in practice have different kinds of features depending on how it is used, all feature information that needs to be considered is assigned, for example "person, place, group" and the like. After the proper name has been provided with this feature information, the section 4034 sends the data to the processing section 4036 and stores them in its table 4036a.
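How the sections 4030, 4032 and 4034 divide the work on feature information can be summarized in a short sketch. This is an interpretation, not the patented implementation: feature information is modelled as a Python set, `None` or an empty set stands for "not registered", and all names are hypothetical. The text does not say what happens when two registered feature sets have nothing in common, so the sketch simply returns the empty intersection in that case.

```python
DEFAULT_FEATURES = frozenset({"person", "place", "group"})


def resolve_features(prev_features, own_features):
    """Choose the feature set for two adjacent proper names treated as one."""
    if prev_features and own_features:
        # Both registered: keep only the common part (section 4032).
        return set(prev_features) & set(own_features)
    if prev_features or own_features:
        # Only one registered: use its information for both (section 4030).
        return set(prev_features or own_features)
    # Neither registered: assign every feature worth considering (section 4034).
    return set(DEFAULT_FEATURES)


common = resolve_features({"person", "place"}, {"place", "group"})
fallback = resolve_features(None, None)
```

With these inputs `common` is `{"place"}` and `fallback` is the full default set, which leaves the choice among "person", "place" and "group" to later analysis.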

The section 4036 stores the data from the section 4032 for the preceding proper name and the proper name itself, from the section 4034 and from the retrieval section 4022 in the dictionary information preservation table 4036a, and subsequently reads out the stored data and passes them to the parsing section 4038. The section 4038 parses the input sentence after it has undergone morpheme analysis and has been read out of the table 4036a.

The operation of this device will now be explained with reference to the flowchart in Fig. 32. First, an English sentence entered from the device 4012 is read into the input processing section 4014 (4100). The sentence read into the section 4014 is loaded into the buffer 4014a, and the input sentence loaded into the buffer 4014a is read out to the unit division section 4016. When the sentence has been input, the section 4016 reads the delimiters from the delimiter table 4018 and divides the sentence into dictionary reference units (4102). That is, the character strings making up the input sentence are divided in order from the beginning, at the places where delimiters such as a space or a colon occur, into lookup character strings that serve as the units for looking up the reference dictionary 4020. The section judges whether or not the divided dictionary reference units, i.e. the lookup character strings, are exhausted (4104), and if lookup character strings remain, it passes the next lookup character string to the section 4022.

When a lookup character string has been passed to the section 4022, the section 4022 looks up the reference dictionary 4020 for it (4106). The section judges whether or not the lookup character string is present among the entries of the reference dictionary 4020 as shown in Fig. 31 (4108). If an entry is present, it reads out the part-of-speech information stored in the dictionary 4020 and judges whether or not the lookup character string is a proper name (4110). If the lookup character string is not a proper name, the section 4022 passes the data read out of the dictionary 4020 to the processing section 4036 and records them in its table 4036a (4112). Once the data have been stored in the table 4036a, a notification that the data have been stored, together with the data for the lookup character string just stored, is input from the processing section 4036 to the section 4016. The process then returns to step 4102, and the section 4016 divides off the next dictionary reference unit.

If the fetch string is judged at step 4110 to be a proper name, the section 4022 passes the data for the proper name read from the dictionary 4020 (hereinafter simply "the proper name") to the proper name processing section 4024, together with the data for the preceding fetch string, which have been supplied from the table 4036a of the section 4036 to the section 4022 via the section 4016. The processing for a proper name registered in the dictionary is then carried out in the section 4024 (4114).

The processing for a proper name registered in the dictionary will now be explained with reference to the flowchart in Fig. 33. The data passed from the section 4022 to the section 4024 are input, via the processing section 4026 for the preceding sentence end, to the processing section 4028 for the preceding proper name. The section 4026 has no function in the processing of a proper name registered in the dictionary. The processing section 4028 judges whether or not the entry preceding the proper name is a proper name not registered in the dictionary 4020, that is, whether or not it was handled by the processing for a proper name not registered in the dictionary, which is described later (4200). If it is an unregistered proper name, the processing section judges the whole of the proper name and the preceding unregistered proper name to be one proper name having the feature information of the proper name (4202), passes the data to the processing section 4036, and stores them in its table 4036a (4218).

If the section 4028 judges at step 4200 that the entry preceding the proper name is not an unregistered proper name, it then judges whether or not the entry preceding the proper name is a proper name registered in the dictionary 4020 (4204). If the entry immediately preceding the proper name is a registered proper name, it is judged whether or not the feature information for the preceding proper name is unknown, that is, whether or not that information is registered in the dictionary 4020 (4206).

If the feature information for the preceding proper name is unknown, the flow proceeds to step 4202, where the whole of the proper name itself and the immediately preceding proper name is judged to be one proper name with feature information (4202). The processing section 4028 for the preceding proper name then passes the data to the processing section 4036, in whose table 4036a the data are recorded (4218).

If the processing section 4028 judges at step 4206 that the feature information for the preceding proper name is not unknown, that is, that it is registered in the dictionary 4020, the data are passed from the section 4028 to the processing section 4030. The section 4030 then judges whether or not the feature information of the proper name itself is unknown (4208). If the feature information for the proper name is unknown, the section 4030 judges the whole of the proper name itself and the immediately preceding proper name to be one proper name having the feature information of the proper name (4210) and passes the data to the processing section 4036, in whose table 4036a the data are recorded.

If the processing section 4030 determines that the feature information of the proper name itself is not unknown, that is, that it is registered in the dictionary 4020, the section 4030 passes the data to the processing section 4032 for the preceding proper name and the proper name itself. The processing section 4032 then judges whether the feature information of the proper name itself and that of the immediately preceding proper name have any feature in common (4212). If there is a common feature, it judges the whole of the proper name itself and the immediately preceding proper name to be one proper name having the feature information of the common part (4214) and passes the data to the processing section 4036, in whose table 4036a the data are then recorded (4218).

If the feature information of the proper name itself and that of the immediately preceding proper name have no feature in common, the processing section judges the proper name to be a proper name separate from the immediately preceding one, having the feature information retrieved from the dictionary 4020 (4216), and passes the data to the processing section 4036, in whose table 4036a the data are then recorded (4218).

If, at step 4108 in Fig. 32, the fetch string is not present among the entries of the reference dictionary 4020, it is judged whether or not the first character of the fetch string is an uppercase letter (4116). If it is not an uppercase letter, the section 4022 regards the fetch string as an unregistered word and passes it to the processing section 4036 so that it is recorded in its table 4036a (4118).
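The merging rules for a dictionary-registered proper name (Fig. 33, steps 4200 to 4218) can be illustrated with a minimal Python sketch. The `Entry` class and `record_proper_name` function are hypothetical names introduced here, not part of the disclosure. The sketch assumes that unknown feature information is represented as `None`, so that the tests of steps 4200 and 4206 collapse into one check, and it assumes that at step 4210 the merged name takes the preceding name's features, which the text leaves open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    text: str
    is_proper: bool = False
    features: Optional[frozenset] = None  # None = feature information unknown


def record_proper_name(table: List[Entry], cur: Entry) -> None:
    """Record proper name `cur`, merging it with the preceding entry where the rules allow."""
    prev = table[-1] if table else None
    if prev is None or not prev.is_proper:
        table.append(cur)                      # 4216: recorded on its own
        return
    if prev.features is None:
        merged = cur.features                  # 4200/4206 -> 4202
    elif cur.features is None:
        merged = prev.features                 # 4208 -> 4210 (assumed reading)
    else:
        merged = prev.features & cur.features  # 4212: common features
        if not merged:
            table.append(cur)                  # 4216: nothing in common
            return
    # 4214 / 4218: replace the preceding entry with the merged proper name
    table[-1] = Entry(f"{prev.text} {cur.text}", True, merged)


# Toy data taken from the example sentence discussed later in the text.
table = [Entry("In")]
for text, feats in [("Tokyo", None),
                    ("Station", {"place", "group"}),
                    ("Mr.", {"person"}),
                    ("Walter", {"person", "place", "group"})]:
    record_proper_name(table, Entry(text, True, frozenset(feats) if feats else None))

print([(e.text, sorted(e.features) if e.features else None) for e in table])
```

With this toy input the table ends up holding "In", "Tokyo Station" (place, group) and "Mr. Walter" (person), which matches the grouping the description arrives at for the same words.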

If the first character is an uppercase letter, the data for the fetch string, together with the data for the preceding fetch string, are passed from the section 4022 to the proper name processing section, where the processing for a proper name not registered in the dictionary is carried out (4120).
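The branching of Fig. 32 for a single fetch string (steps 4106 to 4120) can be summarised in a short Python sketch. The function name `dispatch`, the returned labels and the toy dictionary layout are assumptions made for illustration only; the real reference dictionary 4020 holds far richer morpheme data.

```python
def dispatch(token: str, dictionary: dict) -> str:
    """Return which branch of Fig. 32 a fetch string would take."""
    entry = dictionary.get(token)            # 4106, 4108: look up the entry
    if entry is not None:
        if entry["proper"]:                  # 4110: is it a proper name?
            return "registered_proper_name"  # 4114: continue with Fig. 33
        return "record_dictionary_data"      # 4112: store in table 4036a
    if token[:1].isupper():                  # 4116: first character uppercase?
        return "unregistered_proper_name"    # 4120: continue with Fig. 34
    return "unregistered_word"               # 4118


# Hypothetical dictionary contents, for illustration only.
toy_dictionary = {
    "Station": {"proper": True},
    "met": {"proper": False},
}

for word in ["Station", "met", "Tokyo", "xyz"]:
    print(word, "->", dispatch(word, toy_dictionary))
```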

The processing of a proper name not registered in the dictionary will now be described with reference to Fig. 34. The data for the fetch string, together with the data for the preceding fetch string, are passed to the processing section 4026, which then judges whether or not the end of the preceding entry is a candidate for the end of a sentence (4300). This judgment is made by checking whether the end of the preceding entry is an end-of-sentence candidate such as a separate period (.).

If the end of the preceding entry is a candidate for the end of a sentence, the data are passed from the processing section 4026 to the processing section 4028, which then regards the preceding entry as the end of the sentence (4302), converts the first character of the fetch string to a lowercase letter, and passes the string to the section 4022.

The section 4022 looks up the dictionary 4020 for the fetch string whose first character has been converted to lowercase (4304) and judges whether or not there is an entry in the reference dictionary 4020 (4306). If there is an entry, the section 4022 passes the data retrieved from the dictionary 4020 to the processing section 4036 and stores them in its table 4036a (4308). If there is no entry, the section 4022 restores the first character of the fetch string to uppercase and passes the string as an unregistered proper name to the processing section 4036, so that it is recorded in its table 4036a (4310).

If, at step 4300, the processing section 4026 for the preceding sentence end judges that the end of the preceding entry is not a candidate for the end of a sentence, the data are passed from the section 4026 to the section 4028, which then judges that the preceding entry is not the end of the sentence (4312). The data are passed from the section 4028 to the section 4030, which then regards the fetch string as a proper name whose feature information is unknown (4314).

The section 4030 then returns the data to the processing section 4028, which carries out the processing for a proper name registered in the dictionary (4316). This processing is the same as that shown in Fig. 33.
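The handling of a capitalised word with no dictionary entry (Fig. 34, steps 4300 to 4316) can be sketched in Python as follows. The function name `handle_capitalized_unknown` and the returned labels are hypothetical. The sketch assumes that a sentence-end candidate is either the start of the input (as in the example sentence discussed later in the text) or a separate period token; the text only says "a separate period (.), etc.", so other candidates are deliberately not guessed here.

```python
from typing import Optional, Tuple


def handle_capitalized_unknown(token: str,
                               prev_text: Optional[str],
                               dictionary: dict) -> Tuple[str, str]:
    """Return (text to record, classification) for a capitalised word without an entry."""
    # 4300: is the preceding entry a candidate for the end of a sentence?
    sentence_start = prev_text is None or prev_text == "."
    if sentence_start:
        lowered = token[0].lower() + token[1:]      # 4302: lowercase first letter
        if lowered in dictionary:                   # 4304, 4306: retry the lookup
            return lowered, "dictionary_word"       # 4308
        return token, "unregistered_proper_name"    # 4310: uppercase restored
    # 4312, 4314: not sentence-initial, so treat as a proper name with
    # unknown features; the caller would continue with Fig. 33 (4316).
    return token, "proper_name_unknown_features"


# Hypothetical dictionary contents, for illustration only.
toy_dictionary = {"in": {"proper": False}}

print(handle_capitalized_unknown("In", None, toy_dictionary))
print(handle_capitalized_unknown("Tokyo", "In", toy_dictionary))
print(handle_capitalized_unknown("Zork", ".", toy_dictionary))
```

The lowercase retry is what keeps an ordinary sentence-initial word such as "In" from being mistaken for a proper name merely because it is capitalised.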

If, in Fig. 32, the reference units are exhausted at step 4104, the section 4022 outputs a signal indicating this to the section 4034, which then reads out the information recorded in the table 4036a of the section 4036 and provides the proper names with the default feature information (4122).

The operation of providing a proper name with the default feature information will now be described with reference to Fig. 35. In the section 4034 provided for this purpose, a pointer is first set to the head of the data in the table 4036a (4400). That is, the pointer is set to the entry at the head of the input sentence, which has been divided into entries, each provided with information by the lookup in the reference dictionary 4020. It is then judged whether or not the entry indicated by the pointer is a proper name (4402); if it is a proper name, it is judged whether or not the feature information of the proper name is unknown (4404). If it is not a proper name, the flow proceeds to step 4408 and the pointer is advanced to the next entry.

If the feature information for the proper name is found to be unknown at step 4404, the default feature information is provided (4406). The default feature information is given to a proper name whose feature information is unknown, as shown in the lower part of Fig. 36. There, for example, the proper name "Johnson", whose feature information is unknown, is provided with all kinds of feature information, that is, "person, place, group, and other". Providing a proper name whose feature information is unknown with all kinds of feature information leaves room for the proper name to be resolved to one of these features in the subsequent structural analysis.

If the feature information for the proper name is found not to be unknown at step 4404, the flow proceeds to step 4408 and the pointer is advanced to the next entry. It is then judged whether or not the entry indicated by the pointer is at the end (4408); if it is not at the end, the flow returns to step 4402 and the next entry is checked as to whether or not it is a proper name. If the entry indicated by the pointer is at the end, the provision of the default information is finished. After the provision of the default feature information to the proper names is finished, the data recorded in the table 4036a are passed from the section 4036 to the syntactic analysis section (4124), which completes the morpheme analysis in this embodiment.
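The pointer loop of Fig. 35 (steps 4400 to 4408) amounts to a single pass over the table. The following Python sketch shows it under the assumption that each table entry is a dictionary with hypothetical keys `text`, `proper` and `features`, with `None` standing for unknown feature information; the four default features are those named in the text.

```python
# Default feature set named in the description: person, place, group, other.
DEFAULT_FEATURES = frozenset({"person", "place", "group", "other"})


def assign_default_features(table: list) -> list:
    """Give every proper name with unknown features the full default set."""
    for entry in table:                          # 4400, 4408: walk from head to end
        if not entry["proper"]:                  # 4402: skip non-proper-name entries
            continue
        if entry["features"] is None:            # 4404: feature information unknown?
            entry["features"] = DEFAULT_FEATURES # 4406
    return table


# Hypothetical table contents, for illustration only.
table = [
    {"text": "met", "proper": False, "features": None},
    {"text": "Mr. Walter", "proper": True, "features": frozenset({"person"})},
    {"text": "Johnson", "proper": True, "features": None},
]
assign_default_features(table)
for entry in table:
    print(entry["text"], sorted(entry["features"]) if entry["features"] else None)
```

Entries whose features are already known are left untouched, so only "Johnson" receives the default set here.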

The operation of the device described above will now be explained using an input sentence as an example, namely the case in which the sentence "In Tokyo Station Mr. Walter met Johnson" is input. First, input processing (4100) is performed to read the input sentence into the processing section 4014. Then the dictionary reference unit is divided (4102), and the input sentence is divided into words at the spaces. The reference dictionary 4020 is first looked up for "In" (4106). Since there is no entry for "In" in the reference dictionary 4020, the flow first passes to the processing for a proper name not registered in the dictionary. However, because the preceding part is the beginning of the input character string, it is regarded as a candidate for the end of a sentence, and the reference dictionary 4020 is looked up for "in", obtained by converting "I" to "i". Since it is not a proper name (4110), the data retrieved from the dictionary 4020 are recorded in the table 4036a (4112).

Then the dictionary 4020 is looked up for "Tokyo" (4106). Since there is no entry for "Tokyo" in the dictionary 4020 (4108) and the first character is an uppercase letter (4116), the processing for a proper name not registered in the dictionary is carried out (4120). The flow then advances to Fig. 34. Since the preceding part is "In", it is not a candidate for the end of a sentence (4300); "In" is judged not to be the end of the sentence (4312), "Tokyo" is recognized as a proper name whose feature information is unknown (4314), and the processing for a proper name registered in the dictionary is carried out (4316).

The flow advances to Fig. 33. Since "In" in the preceding part is neither an unregistered (4200) nor a registered proper name (4204), "Tokyo" is recorded as a proper name with its own feature information, that is, as a proper name whose feature information is unknown (4216). In Fig. 32, the dictionary 4020 is then looked up for "Station" (4106). Since there is an entry for "Station" in the dictionary 4020 (4108) and it is a proper name (4110), the processing for a proper name registered in the dictionary is carried out (4114). The flow passes to the flowchart of Fig. 33. Since "Tokyo" in the preceding part is an unregistered proper name (4200), "Tokyo Station" as a whole is recorded as one proper name having the feature information "place, group" that belongs to "Station" (4202).

Then the reference dictionary 4020 in Fig. 32 is looked up for "Mr." (4106). Since "Mr." has an entry in the dictionary 4020 and is a proper name (4110), the processing for a proper name registered in the dictionary is carried out (4114). The flow then advances to Fig. 33. "Station" in the preceding part is not an unregistered proper name (4200) but a registered proper name (4204), and its feature information (place, group) is not unknown (4206). Since "Mr." has the feature information "person", which is not unknown (4208), it is judged whether the feature information of "Station" in the preceding part and that of "Mr." have any part in common (4212). Since "Station" means "place, group" while "Mr." means "person" and they have no part in common, "Mr." alone is recorded as a proper name with the feature information "person" (4216).

The flow then returns to the flowchart in Fig. 32, and the reference dictionary 4020 is looked up for "Walter" (4106). Since there is an entry for "Walter" in the dictionary 4020 (4108) and it is a proper name (4110), the processing for a registered proper name is carried out (4114). The flow passes again to the flowchart of Fig. 33. Since "Mr." in the preceding part is not an unregistered proper name (4200) but a registered proper name (4204), its feature information "person" is not unknown (4206), and the feature information for "Walter", "person, place, group", is likewise not unknown (4208), the common part of the feature information is judged (4212). Since the feature information has a common part ("person"), "Mr. Walter" is recorded together as one proper name with the feature information "person" (4214).

Then "met" is looked up in the dictionary 4020 (4106). Since there is an entry (4108) and it is not a proper name (4110), the data retrieved from the dictionary 4020 are recorded in the table 4036a (4112). Next, "Johnson" is looked up in the dictionary 4020 (4106). Since there is no entry for "Johnson" (4108) and the first character is an uppercase letter (4116), the processing for a proper name not registered in the dictionary is carried out (4120). The flow then passes to the flowchart of Fig. 34. Since "met" in the preceding part is not a candidate for the end of a sentence (4300), "met" is judged not to be the end of the sentence (4312), "Johnson" is regarded as a proper name with unknown feature information (4314), and the processing for a proper name registered in the dictionary is carried out (4316). The flow passes again to the flowchart of Fig. 33. Since "met" in the preceding part is neither an unregistered (4200) nor a registered proper name (4204), "Johnson" alone is recorded as a proper name whose feature information is unknown.

After the preceding processing steps, default feature information is provided for the proper names, as shown in Fig. 35. The pointer is set to "In", the head of the dictionary reference units (4400). Since it is not a proper name (4402), the pointer is advanced (4408) and set to "Tokyo Station". Since "Tokyo Station" is a proper name (4402) and its feature information is not unknown, because "Tokyo Station" as a whole was recognized as "place, group" in the earlier processing for registered proper names (4404), the pointer is advanced (4408) and set to "Mr. Walter".

Since "Mr. Walter" is also a proper name (4402) and its feature information is not unknown (4404), the pointer is advanced (4408). Since "met" is not a proper name (4402), the pointer is advanced (4408). Since "Johnson" is a proper name (4402) whose feature information is unknown, default feature information is provided (4406), and "Johnson" is given the feature information "person, place, group, etc.", as shown in Fig. 36.
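The pointer loop of Fig. 35 can be sketched as follows. This is only an illustrative sketch: the `Unit` class, the function name, and the representation of "unknown" as an empty set are assumptions made for the example and do not appear in the specification.

```python
from dataclasses import dataclass

# Assumed default set, taken from the "person, place, group, etc." example.
DEFAULT_FEATURES = frozenset({"person", "place", "group"})


@dataclass
class Unit:
    """One dictionary reference unit (hypothetical structure)."""
    text: str
    is_proper_name: bool = False
    features: frozenset = frozenset()  # empty set stands for "unknown"


def assign_default_features(units, default=DEFAULT_FEATURES):
    """Walk the units from the head and fill in unknown proper-name features."""
    for unit in units:  # the pointer starts at the head and is advanced
        if unit.is_proper_name and not unit.features:
            unit.features = default  # corresponds to step 4406
    return units


sentence = [
    Unit("In"),
    Unit("Tokyo Station", True, frozenset({"place", "group"})),
    Unit("Mr. Walter", True, frozenset({"person"})),
    Unit("met"),
    Unit("Johnson", True),  # feature information unknown
]
assign_default_features(sentence)
```

After the call, only "Johnson" has changed: it carries the default set, while the other units keep the feature information they already had.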

As described above, in this embodiment an input German (English) sentence is divided into retrieval character strings, which are then looked up in the reference dictionary 4020, and if a string has an entry as a proper name in the dictionary 4020, processing for a registered proper name is performed. In that processing, the preceding retrieval character string is taken into account; if it is also a proper name, the feature information of the preceding string and of the proper name under consideration is examined. If one of them has no feature information, it is given the feature information of the other, whereas if both have feature information, the common part is regarded as the feature information of these proper names. Consequently, a proper name that has no feature information can be given suitable feature information, and the feature information provided can be narrowed down to the more suitable items. This enables a more effective analysis in the subsequent structural analysis and in producing the corresponding translation.
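The rule for two adjacent proper names summarized above can be illustrated with a small sketch. The function name, the use of sets, and the `None` return value for "no common part" are assumptions made for this example; the specification describes the rule only in terms of the flowchart of Fig. 33.

```python
def shared_features(prev, cur):
    """Return the feature information for two adjacent proper names.

    An empty collection stands for "unknown". None means the two names
    have no common part and are therefore recorded separately.
    """
    prev, cur = frozenset(prev), frozenset(cur)
    if not prev:          # preceding name unknown: take over the other's features
        return cur
    if not cur:           # current name unknown: take over the other's features
        return prev
    common = prev & cur   # both known: keep only the common part
    return common if common else None


# "Station" (place, group) followed by "Mr." (person): no common part.
station_mr = shared_features({"place", "group"}, {"person"})
# "Mr." (person) followed by "Walter" (person, place, group): common part "person".
mr_walter = shared_features({"person"}, {"person", "place", "group"})
```

In the worked example of the description, the first case leaves "Mr." registered on its own, while the second case records "Mr. Walter" together with the feature information "person".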

Furthermore, when the first character of a character string not registered in the dictionary 4020 is an uppercase character and the preceding character string is found to be the end of a sentence, the uppercase character is converted to a lowercase character and the dictionary 4020 is searched again, so that a character string at the head of a sentence can also be found in the dictionary 4020. When a character string beginning with an uppercase character appears anywhere other than at the head of a sentence, it is judged to be a proper name, and it is given the feature information of a neighboring proper name whose feature information is registered, if such a name exists before or after it. Consequently, a proper name that is not registered in the dictionary 4020 can be parsed to some extent.
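A minimal sketch of this lookup strategy is given below. The dictionary is modelled as a plain Python `dict`, and the function name and result labels are hypothetical. The case of a capitalized sentence-initial word that is still not found after lowercasing is not spelled out in this passage, so the sketch simply reports it as "unknown"; that is an assumption.

```python
def classify(word, dictionary, prev_is_sentence_end):
    """Classify a retrieval string as registered, proper name, or unknown."""
    if word in dictionary:
        return ("registered", dictionary[word])
    if word[:1].isupper():
        if prev_is_sentence_end:
            # Head of a sentence: lowercase the first character and retry.
            lowered = word[0].lower() + word[1:]
            if lowered in dictionary:
                return ("registered", dictionary[lowered])
            return ("unknown", None)  # assumption: not covered by the text
        # Capitalized inside a sentence: treated as an unregistered proper name.
        return ("proper_name", None)
    return ("unknown", None)


toy_dictionary = {"in": "preposition", "met": "verb"}  # illustrative entries only
head = classify("In", toy_dictionary, prev_is_sentence_end=True)
name = classify("Johnson", toy_dictionary, prev_is_sentence_end=False)
```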

Furthermore, since a proper name that has no feature information is first given all the necessary feature information and the feature information that is not required is removed during processing of the word, it is possible to analyze a proper name whose feature information is unknown or a proper name that is not registered.

Furthermore, since several items of feature information are provided for a given proper name and the suitable items are selected depending on the feature information of the proper names before and after it, appropriate feature information can be selected when a proper name with several items of feature information is parsed in relation to its neighbors, which makes effective parsing of the input sentence possible.

A fifth embodiment of the invention will now be described. Fig. 38 shows the overall structure of the fifth embodiment of a language analyzer according to the invention, applied to an automatic translation device for German (English) to Japanese. This embodiment has an input section 5010, through which a German/English text 5020 to be translated into Japanese is entered. The input section 5010 may be, for example, a keyboard with character keys such as alphanumeric and function keys, an optical character reader (OCR) for reading German/English text recorded on paper, and/or a file storage device for reading German/English text recorded on a storage medium such as a magnetic disk.

The German/English text entered through the input section 5010 is read into a pre-editing section 5014, where preprocessing for the translation is performed. Here, mainly sentence recognition and the processing of unknown words are carried out. These functions are part of the morphological analysis. The pre-edited German/English data are transferred, together with the information obtained during pre-editing, to a morphological analysis section 5016. The section 5016 divides the data into sentences while consulting a word dictionary 5018, analyzes the German/English morphemes, performs processing for unknown words, proper names, and various compounds such as time expressions, and performs processing for the sentence as a whole, such as recognition of tag questions and appositions. The rules for the morphological analysis are stored in a rule file 5036.

After the morphological analysis, the German/English data are transferred, together with the dictionary information obtained in the morphological analysis, to a syntactic analysis section I 5020. The section I 5020 is a functional section that parses the surface structure of the sentence by applying grammatical rules to the German/English data, and it finds every structural possibility.

The German/English data that have undergone syntactic analysis in the section I 5020 are passed, together with the analyzed information, to a syntactic analysis section II 5022, where one solution is selected from the surface-structure results of syntactic analysis I by applying the structural description. A plausible parse tree for the German/English sentence is thus prepared and its structure established. The rules for syntactic analysis are likewise stored in the rule file 5036.

After the syntactic analysis, the German/English data are transferred as parse-tree data to a structure conversion section 5024. In the section 5024, a structure tree of the corresponding Japanese sentence is prepared from the structure tree that forms an intermediate structure of the German/English sentence, and it is converted into an underlying Japanese structure from which the Japanese sentence can easily be generated. The data for the structure tree representing the underlying Japanese structure obtained by the structure conversion are passed to a translation section 5026, in which a translated sentence is generated. This is a functional section that generates a Japanese sentence from the Japanese structure tree.

The Japanese data resulting from the translation, that is, the data for the translated sentence, are then passed to a post-editing section 5030. The section 5030 modifies the translated data by consulting the dictionary 5018, using information that was used in the translation, in order to produce a more natural Japanese sentence. The data for the Japanese sentence are passed to an output section 5032 and output from it as a translated Japanese sentence 5034. The output section 5032 may include, for example, a printer, a display, and a file storage device such as a magnetic disk. The flow of this series of translation operations is controlled by a control section 5038, which governs the device as a whole. The word dictionary 5018 stores dictionary data for German/English and Japanese words, in which, in addition to the vocabulary, various information is registered, such as a connecting relationship (that is, an existing relationship), meanings, singular or plural forms, parts of speech, and so on. Furthermore, the file 5036 stores rule data for the morphological and syntactic analysis.

The control section 5038 is connected to an operation/display section 5040, which has operation keys for passing various instructions from an operator to the device according to the invention, such as a translation instruction key and cursor keys, and a display that visually presents the entered German/English text, the Japanese text resulting from the translation, intermediate data such as dictionary information, various indications for the operator, and so on. The device may also be configured so that most of the operation/display functions are provided in a keyboard, if one is arranged at the input section 5010, or in a display, if one is arranged at the output section 5032.

Fig. 37 shows the detailed structure of the morphological analysis section 5016 as it relates to the processing of proper names. Only the part of the section 5016 that bears directly on understanding the invention is shown, although there are of course other functional sections concerned with morphological analysis. The morphological analysis is performed by instructing dictionary retrieval successively from the head of the input character string, in accordance with the retrieval key character strings, and by processing the dictionary information thus obtained from the dictionary retrieval section 5104 according to the position information of the proper name, which is described later.

The section 5016 has an input processing section 5100 for receiving the character string data supplied from the section 5014 and performing input processing. The section 5100 is provided with an input character string buffer, which is supplied with the German/English character string data in the form of coded data, for example ASCII, and temporarily stores the character string data.

The input character string data temporarily stored in the section 5100 are passed to a section 5102, which divides the data into dictionary reference units such as words. The section 5102 is a functional section that identifies the dictionary reference units, which serve as the retrieval key character strings for successively searching the dictionary 5018 in the section 5104. The dictionary reference delimiters used in dividing the text into dictionary reference units are set at the respective positions of the German/English characters and comprise a number, an apostrophe, characters other than a hyphen and a period, and an apostrophe that follows a space. They are stored in a delimiter table 5108 and are referred to when the dictionary reference units are divided in the section 5102.

The reference dictionary 5018 stores, in particular, the information to be retrieved for each unit of division. For example, as shown in Fig. 38 for an example of entry information, grammatical information such as the part of speech is included for each dictionary reference unit, for example for the entry of a word. For a noun, the part-of-speech information contains an indication of whether it is a common noun or a proper name. For a proper name, a distinguishing indication is stored that shows how its position in the sentence is restricted, that is, position information for the proper name. This is described in detail later. Other information is also registered, for example whether a noun is countable or uncountable, whether a verb is intransitive or transitive, its translation, and so on.

There are four kinds of position information for a proper name, namely the patterns "0" to "3" in the present embodiment. Pattern "0" indicates a proper name without position restriction, for example "City" or the personal name "Walter". Pattern "1" indicates a proper name, for example "Mr.", that is placed at the head of a single proper name or of a sequence of proper names, that is, at the head of the proper names grouped into a single proper name. Pattern "2" indicates a proper name, for example "Station" or "Bay", that is placed at the end of a single proper name or of the proper names grouped into a single proper name, and that differs from pattern "3" described below. Pattern "3" indicates a proper name, for example "River" in "the Sumida River", that is the same as pattern "2" except that a definite article "the" is placed at the head of the proper name or of the proper names grouped into a single proper name.
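The four position patterns lend themselves to a small enumeration with a consistency check. The sketch below is an interpretation rather than the patented procedure: the enum member names, the function `position_ok`, and the assumption that words such as "Tokyo" and "Sumida" carry pattern "0" are introduced only for illustration.

```python
from enum import IntEnum


class Position(IntEnum):
    """Position information for a proper name (member names are hypothetical)."""
    FREE = 0               # no position restriction, e.g. "City", "Walter"
    HEAD = 1               # head of the group, e.g. "Mr."
    TAIL = 2               # end of the group, e.g. "Station", "Bay"
    TAIL_WITH_ARTICLE = 3  # end of the group, with "the" in front, e.g. "River"


def position_ok(patterns, has_leading_article=False):
    """Check whether a candidate group respects each word's position pattern."""
    last = len(patterns) - 1
    for i, pattern in enumerate(patterns):
        if pattern == Position.HEAD and i != 0:
            return False
        if pattern in (Position.TAIL, Position.TAIL_WITH_ARTICLE) and i != last:
            return False
        if pattern == Position.TAIL_WITH_ARTICLE and not has_leading_article:
            return False
    return True


mr_walter_ok = position_ok([Position.HEAD, Position.FREE])         # "Mr. Walter"
station_mr_ok = position_ok([Position.TAIL, Position.HEAD])        # "Station Mr."
sumida_ok = position_ok([Position.FREE, Position.TAIL_WITH_ARTICLE],
                        has_leading_article=True)                  # "the Sumida River"
```

A check of this kind rejects groupings such as "Station Mr.", where a tail word would precede a head word, which is the sort of erroneous grouping the position restriction is meant to prevent.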

The dictionary retrieval section 5104 is a functional section that outputs information by searching the word dictionary 5018 on the basis of the retrieval key character string from the section 5102, and transfers it to the dictionary information holding table 5124, to the position information processing section 5110, and to the section 5111, which judges the preceding sentence end. The processing according to the patterns "0" to "3", which correspond to the position information of the proper names retrieved from the dictionary 5018, is performed by the proper name processing sections 5114, 5116 and 5118. Proper names are processed for pattern "1" in the section 5114, for patterns "2" and "3" in the section 5116, and for pattern "0" in the sections 5118 and 5114, respectively.

In this embodiment, proper names are grouped together by using as a key, for example, a word that forms part of a group of proper names combined into a single proper name, and they are subjected to a position restriction when they are grouped into one proper name. Even when several proper names occur in succession, they can be grouped appropriately in accordance with the context, without the erroneous grouping that results from simply always combining them into a single group of proper names. The processing for this purpose is performed in the sections 5114, 5116 and 5118.

A certain range of proper names is registered in the reference dictionary 5018. Such dictionary-registered proper names are morphologically analyzed in the section 5110 and the sections 5114, 5116 and 5118, which together form the processing section for proper names registered in the dictionary. Proper names not registered in the dictionary 5018 are morphologically analyzed in the section 5112, which judges the preceding sentence end, and in the section 5118, which processes proper names corresponding to pattern "0". These form the processing section for proper names not registered in the dictionary.

A proper name is processed in the following two steps. First, a proper name is recognized in the input character string. For a word registered in the dictionary 5018, this is done on the basis of the proper name indication in its morpheme actuation information. For a word not registered in the dictionary 5018, this is done on the basis that its first character is a German/English uppercase character, as in "John" or "U.S.".

Next, a run of proper names is grouped so that the whole becomes a single proper name. If a unit is recognized as a proper name from the dictionary information and the next dictionary reference unit is also a proper name, the two are combined into one proper name. For example, "M. Weber" is analyzed as a single proper name. The result of this analysis serves as a candidate for grouping idiomatic expressions that include proper names during local parsing.

The necessary local analysis is then carried out. Here, a sequence of parsing units that have been activated by the morpheme actuation information of each parsing unit is combined into a single parsing unit on the basis of a local parsing rule. For example, "Mr. Brown" is combined into "Brown shi". Words that form part of a district name are grouped in the same way; for example, "Lake Biwa" is combined into "Biwako". Likewise, words that form part of a group name are grouped; for example, "Yale University" is analyzed as "Yale Daigaku".

In the case of proper names such as "Mr. ..." and "Lake ...", a break in the context always lies immediately before them. Consequently, if "Tom Brown" is collectively combined into a single proper name, an error arises in the subsequent analysis. Similarly, a proper name such as "University" is always followed immediately by a break. For example, in the English sentence "At Yale University Tom is ...", a break is recognized between "University" and "Tom". In this embodiment, information on the position at which a proper name is restricted within a sequence of proper names is stored in dictionary 5018 as the position information described above, that is, as patterns "0" to "3". Grouping on the basis of this position information is carried out in processing sections 5110, 5112, 5114, 5116 and 5118. Once these operations are complete, the dictionary information for the input character string is stored in the buffer for retrieved dictionary information, that is, in the dictionary information preservation table 5124.

The result of the morphological analysis is transferred from table 5124 to the morphological analysis section I 5020. The processing that uses the proper-name position information follows the sequence shown in Fig. 40. Input processing is performed by taking the data of the input character string into the input processing section 5100 (5200). Section 5102 then divides the input character string into dictionary reference units for looking up dictionary 5018 (5201). Section 5104 looks up dictionary 5018 accordingly (5203) and, if there is a dictionary entry (5204), checks its part of speech (5205). If the part of speech is not a proper name, no proper-name processing is performed in this embodiment and the dictionary information is simply stored in table 5124 (5206). If it is a proper name, the processing 5207 for dictionary-registered proper names is performed in section 5110 and sections 5114, 5116 and 5118. When these operations have reached the end position of the sentence indicated by the input character string data (5202), the result of the morphological analysis is output to section I 5020 (5210).

If the dictionary lookup yields no entry at step 5204 and the unit begins with a capital letter (5212), it is recognized as a proper name not registered in the dictionary, and the processing 5213 for non-registered proper names is performed in section 5112, which judges the preceding part, and in the proper-name processing section 5118. If the initial character is not a capital letter, the unit is a word not registered in dictionary 5018 and is kept in table 5124 as a non-registered word (5214). Processing then continues with the end-position check (5202).
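The branching of Fig. 40 described above can be sketched roughly as follows. This is only an illustrative sketch: the function name `route_unit`, the dictionary layout and the returned labels are hypothetical and do not come from the patent; only the step numbers in the comments are taken from the text.

```python
def route_unit(unit, dictionary):
    """Return a label for the branch of the flow that would handle `unit`."""
    entry = dictionary.get(unit)                      # lookup (5203), entry? (5204)
    if entry is not None:
        if entry["part_of_speech"] != "proper_name":  # part-of-speech check (5205)
            return "store_entry"                      # stored as is (5206)
        return "registered_proper_name"               # processing 5207
    if unit[:1].isupper():                            # capital initial? (5212)
        return "unregistered_proper_name"             # processing 5213
    return "store_unregistered_word"                  # stored as non-registered (5214)


# Hypothetical miniature dictionary; the field names are assumptions.
DICTIONARY = {
    "the": {"part_of_speech": "article"},
    "River": {"part_of_speech": "proper_name", "position": "3"},
    "Mr.": {"part_of_speech": "proper_name", "position": "1"},
}

routes = [route_unit(u, DICTIONARY) for u in ["the", "Sumida", "River", "xyz"]]
```

Here "Sumida" has no entry but starts with a capital letter, so it takes the non-registered proper-name branch, whereas the lower-case "xyz" is merely kept as a non-registered word.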

The processing 5207 for dictionary-registered proper names is performed in processing sections 5110, 5114, 5116 and 5118 according to the flowchart of Fig. 41. First, the position information contained in the dictionary information is consulted (5220). Proper-name processing 5221 for pattern "0" is performed if it indicates pattern "0", proper-name processing 5222 for pattern "1" if it indicates pattern "1", and proper-name processing 5223 for patterns "2" and "3" if it indicates pattern "2" or "3". Processing 5221 for pattern "0" is performed in section 5114 and applies to a proper name that has no position restriction. First, if the part preceding the dictionary reference unit in question is a non-registered proper name (5230), the whole is combined into one proper name with position information "1" and stored in table 5124 (5233). If the preceding part is a proper name with position information "1" (5231), the processing is the same.

If the preceding part is a proper name with position information "0" (5231), the whole is combined into one proper name with position information "0" and stored in table 5124 (5235). If the preceding part is not a proper name with position information "0" either, the unit is stored alone in table 5124 as a proper name with position information "0" (5234).
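The pattern "0" rules can be sketched as below. The table layout (a list of dictionaries with `text`, `kind` and `position` keys) and the helper names are hypothetical. The first test is read here as "the preceding entry was stored as a non-registered word"; that reading is an assumption, chosen because it reproduces the behaviour of the worked example given later in the description.

```python
def proper(text, position):
    """Build a table entry for a proper name (hypothetical layout)."""
    return {"text": text, "kind": "proper_name", "position": position}


def process_pattern0(table, text):
    """Apply the pattern "0" rules to `text`, updating `table` in place."""
    prev = table[-1] if table else None
    if prev is not None:
        prev_pos = prev.get("position") if prev["kind"] == "proper_name" else None
        if prev["kind"] == "unregistered_word" or prev_pos == "1":
            # Combine with the preceding part, position information "1" (5233).
            table[-1] = proper(prev["text"] + " " + text, "1")
            return table
        if prev_pos == "0":
            # Combine with the preceding part, position information "0" (5235).
            table[-1] = proper(prev["text"] + " " + text, "0")
            return table
    # Otherwise store the unit alone with position information "0".
    table.append(proper(text, "0"))
    return table


mr = [proper("Mr.", "1")]
process_pattern0(mr, "Gold")
process_pattern0(mr, "Smith")

river = [proper("the Sumida River", "3")]
process_pattern0(river, "Paul")
```

In this sketch "Mr.", "Gold" and "Smith" end up as one entry with position information "1", while "Paul" after an entry with position information "3" is stored on its own.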

Proper-name processing 5222 for pattern "1" is performed as follows. It applies to a proper name such as "Mr." that stands at the beginning of a single proper name or at the beginning of a group formed from a sequence of proper names. First, if the part preceding the reference unit in question is a non-registered proper name (5240), that word is converted to a non-registered word (5241). If it is not a non-registered word, the unit is stored alone in table 5124 as a proper name with position information pos "1" (5242).

Proper-name processing 5223 for patterns "2" and "3" is now described with reference to Fig. 44. It applies, for example, to a proper name such as "Station" or "River" that stands at the end of a single proper name or of a group formed from a sequence of proper names. First, if the part preceding the reference unit in question is a non-registered proper name (5250), the unit is combined with the preceding word into one proper name whose position information pos is the unit's own position information pos-self, and the result is stored in table 5124 (5255). The same processing is performed if the preceding part is a proper name with position information "1" (5251).

If the preceding part is not a proper name with position information "0" (5252), the unit is made into a proper name on its own, with its own position information pos-self as the proper-name position information, and stored in table 5124 (5257).

If, at step 5252, the preceding part is a proper name with position information "0", the unit's own position information pos-self is checked (5253), and processing 5255 is performed if it is pattern "2". If pos-self is pattern "3", it is further checked whether the element before the preceding reference unit is the definite article "the" (5254). If it is not, processing 5255 is performed. If it is "the", the group from "the" through the unit itself is combined into one proper name with position information "3" and stored in table 5124 (5256).
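The rules for patterns "2" and "3" can be sketched in the same spirit. The table layout and the names `proper` and `process_pattern23` are hypothetical, and the first test is again read as "the preceding entry was stored as a non-registered word", which is an assumption rather than something the text states outright.

```python
def proper(text, position):
    """Build a table entry for a proper name (hypothetical layout)."""
    return {"text": text, "kind": "proper_name", "position": position}


def process_pattern23(table, text, pos_self):
    """Apply the pattern "2"/"3" rules to `text`, updating `table` in place."""
    prev = table[-1] if table else None

    def prev_is_proper(position):
        return (prev is not None and prev["kind"] == "proper_name"
                and prev["position"] == position)

    if prev is not None and (prev["kind"] == "unregistered_word"
                             or prev_is_proper("1")):
        # Combine with the preceding word, keeping pos-self (5255).
        table[-1] = proper(prev["text"] + " " + text, pos_self)
    elif not prev_is_proper("0"):
        # No suitable preceding proper name: store alone (5257).
        table.append(proper(text, pos_self))
    elif pos_self == "3" and len(table) >= 2 and table[-2]["text"] == "the":
        # "the" + preceding name + unit become one proper name (5256).
        table[-2:] = [proper("the " + prev["text"] + " " + text, "3")]
    else:
        # Pattern "2", or pattern "3" without a preceding "the" (5255).
        table[-1] = proper(prev["text"] + " " + text, pos_self)
    return table


sumida = [{"text": "along", "kind": "other"},
          {"text": "the", "kind": "article"},
          proper("Sumida", "0")]
process_pattern23(sumida, "River", "3")

station = [proper("Tokyo", "0")]
process_pattern23(station, "Station", "2")
```

With these inputs the article is absorbed into a single entry "the Sumida River" with position information "3", whereas "Tokyo" and "Station" are simply joined with position information "2". "Tokyo" is an invented input used only for illustration.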

A word that begins with a capital letter and for which lookup 5203 finds no entry in dictionary 5018 is recognized as a non-registered word. The flow passes through steps 5204 and 5212 to processing 5213, which is performed in section 5112, the section that judges the preceding sentence end. First, if the part preceding the reference unit in question is not a sentence-end candidate, processing 5221 for pattern "0" is performed in processing section 5118 as described above.

The preceding part can be a sentence-end candidate in the following four cases. The first is an isolated period ".". The second is the case in which the preceding entry ends in a period and its proper-name position information is not "1"; this covers, for example, an abbreviation such as "U. S. A.". The third is a colon ":", a semicolon ";", a period followed by an apostrophe (.') or a period followed by a quotation mark ("."). The fourth is the case in which the unit stands at the head of the input character string buffer.

If the preceding part matches one of the four cases above, the preceding sentence-end candidate is recognized as the sentence end (5261), and the dictionary is looked up again after the word's initial capital letter has been converted to lower case (5262). If this lookup yields a dictionary entry (5263), the entry is recorded in table 5124 (5264). If not, the word is recorded in table 5124 as a non-registered proper name with its initial capital letter left unchanged (5265).
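The sentence-end test and the lower-case re-lookup can be sketched as follows. The function names, the entry layout and the returned labels are hypothetical; the four cases and the step numbers follow the description above.

```python
def is_sentence_end_candidate(prev):
    """`prev` is the preceding table entry, or None at the head of the buffer."""
    if prev is None:                      # head of the input buffer
        return True
    text = prev["text"]
    if text == ".":                       # isolated period
        return True
    if text.endswith(".") and prev.get("position") != "1":
        return True                       # e.g. "U. S. A.", but not "Mr."
    return text in {":", ";", ".'", '."'}


def handle_capitalised_unknown(word, prev, dictionary):
    """Handle a capitalised word that has no dictionary entry (processing 5213)."""
    if not is_sentence_end_candidate(prev):
        return ("pattern0", word)                  # hand over to processing 5221
    lowered = word[0].lower() + word[1:]           # lower-case the initial (5262)
    if lowered in dictionary:                      # entry found? (5263)
        return ("dictionary_entry", lowered)       # recorded as found (5264)
    return ("unregistered_proper_name", word)      # capital kept (5265)


# Hypothetical one-word dictionary used only for this illustration.
DICTIONARY = {"along": {"part_of_speech": "preposition"}}

first = handle_capitalised_unknown("Along", None, DICTIONARY)
second = handle_capitalised_unknown("Sumida", {"text": "the"}, DICTIONARY)
```

A sentence-initial "Along" is thus resolved to the ordinary word "along", while "Sumida" after "the" is passed on as a proper-name candidate.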

A further example is now explained. Suppose the dictionary is consulted for the input character string "Along the Sumida River Paul and Mr. Gold Smith went ...". The dictionary entry information is first written into table 5124 as shown in Fig. 46A. For "the", for example, the start position in the sentence is "7", the end position is "9", and the part of speech is definite article. Dictionary lookup 5203 finds no entry for the word "Along" at the head of the input string, so it is judged to be non-registered. However, because the word stands at the head of the input buffer, the preceding part satisfies the condition for a sentence-end candidate (5260); the initial capital "A" is therefore converted to lower case and dictionary lookup 5262 is performed with "along".

The pointer is then incremented and processing moves on to "Sumida". In the present embodiment this word is not registered in dictionary 5018. Since the preceding part is not a sentence-end candidate, the flow passes to proper-name processing 5221 for pattern "0". As shown in Fig. 46A, the part of speech is recorded as proper name and the proper-name position information as "0".

The next reference unit, "River", is a proper name with position information "3". The preceding part is a proper name with position information "0", and the part before that is "the". The expression "the Sumida River" is therefore combined into a single proper name through steps 5250 to 5254 and stored in table 5124 with position information "3" (Fig. 46B). The next reference unit, "Paul", is a non-registered proper name with position information "0", to which processing 5213 is applied. Although the preceding word is a proper name, its position information is "3", so nothing is combined and the dictionary information is stored in table 5124 as it is (Fig. 46C). Ordinary processing is applied to the following conjunction "and".

The next word, "Mr.", is a proper name with position information "1" and is stored in table 5124 as it is (Fig. 46D). Even if the preceding part were the proper name "Paul", "Mr." could still be stored in table 5124 as it is, because a break between words always lies immediately before "Mr.".

The word "Gold" is a proper name that is not registered in the dictionary, and processing 5213 is applied to it. Since the preceding word "Mr." has position information "1", the two are combined and the whole is made into one proper name with position information "1" (Fig. 46E). The next word, "Smith", is processed in the same way (Fig. 46F). The following word, "went", is the past tense of a verb, and ordinary parsing is performed from there on.

As described above for this embodiment of the invention, proper names are grouped by using as a key a word that forms part of a group of proper names to be combined into a single proper name and that is subject to a position restriction when so combined. In this way, even when a number of proper names occur in succession, they can be grouped correctly in accordance with the context, without the erroneous grouping that results from simply lumping them all into one group of proper names. In the example above, the words "the Sumida River" are analyzed as one group of proper names, separate from the following word "Paul". "Mr. Gold Smith" is likewise analyzed as one group of proper names.

A sixth embodiment of the speech analyzer according to the invention, applied to an automatic English/German-to-Japanese translation device, is now described with reference to Fig. 47. This embodiment has an input processing section 6014, into which data are entered from an input unit 6012. The input unit 6012 comprises, for example, a keyboard with character keys such as alphanumeric keys and with function keys, an optical character reader for reading English/German text recorded on paper, and a reader for a magnetic disk.

The input processing section 6014 has a buffer 6014a for an input character string and stores the English/German sentence entered from unit 6012 in buffer 6014a. Section 6014 reads out the sentence stored in buffer 6014a and passes it to a unit division section 6016. Section 6016 is a functional section that divides the input sentence from section 6014 into dictionary reference units by consulting a delimiter table 6018. Table 6018 contains delimiters such as spaces, commas and so on.

Section 6016 reads the delimiters from table 6018 and divides the sentence entered from section 6014 into character strings, which serve as the units for looking up a reference dictionary 6020, by splitting the sentence at the points where delimiters occur. The divided character strings are passed to a dictionary retrieval section 6022.
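Division into reference units by means of a delimiter table can be sketched as below. The contents of `DELIMITERS` and the decision to keep a non-space delimiter as a unit of its own are assumptions made for this illustration; the text only states that the table holds delimiters such as spaces and commas.

```python
# Assumed contents of the delimiter table; the text names spaces and commas.
DELIMITERS = {" ", ","}


def split_units(sentence, delimiters=DELIMITERS):
    """Split `sentence` into dictionary reference units at the delimiters."""
    units, current = [], []
    for ch in sentence:
        if ch in delimiters:
            if current:                      # close the unit collected so far
                units.append("".join(current))
                current = []
            if not ch.isspace():             # assumption: punctuation is kept
                units.append(ch)
        else:
            current.append(ch)
    if current:
        units.append("".join(current))
    return units


units = split_units("At Yale University, Tom is")
```

Because the period is not in the assumed table, an abbreviation such as "Mr." would stay intact as one unit in this sketch.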

Section 6022 looks up the reference dictionary 6020 for the input sentence supplied by section 6016 in reference units. As shown in Fig. 48, for example, reference dictionary 6020 contains entries for character strings of English/German sentences together with their part of speech, feature information and so on. In addition to the proper names shown in the figure, it contains character strings of other parts of speech, such as verbs and adjectives. "Proper name" as a part of speech in this figure means that the entry is subject to the registered proper-name processing described later; it does not denote a proper name in the ordinary grammatical sense. The feature information indicates what the proper name in question denotes and is not limited to a single value.

Section 6022 looks up the reference dictionary 6020 for each string produced by the division into lookup units. If the string is a proper name, section 6022 passes it to a section 6024, which performs the proper-name processing described later. If it is not a proper name, the string is passed to a processing section 6036 and stored in the table 6036a of that section. Section 6024 comprises a section 6026 that processes the preceding sentence end, a section 6028 that processes the preceding proper name, and a section 6030 that processes the proper name itself.

Section 6026 judges whether or not the string preceding the input string received from section 6022 lies at the end of a sentence. If the preceding string is at the end of a sentence, the initial capital letter of the string being processed is converted to lower case and the string is returned to section 6022, which looks it up again. A string that this second lookup does not find is classified as an unregistered proper name, passed to processing section 6036, and kept in its table 6036a. If the preceding string is not at the end of a sentence, the string is passed to processing section 6036 as a proper name whose feature information is unknown and is registered in table 6036a, as described later.

Section 6028 analyses the feature information for the preceding string supplied by section 6026 and passes the result to section 6030. Section 6030, which processes the proper name itself, checks the feature information of the proper name under analysis. As described later, if feature information is registered for only one of the proper name and the preceding proper name, section 6030 analyses both using the registered feature information of the other, and stores the result in table 6036a of processing section 6036.

Processing section 6036 has the dictionary-information preservation table 6036a. It stores the data supplied by section 6028 or 6032 in table 6036a, then reads out the stored data and passes it to a syntactic analysis section 6038, which performs the structural analysis of the input sentence using the morphologically analysed data read from table 6036a.

The operation of the device is explained with reference to the flowchart shown in Fig. 49. First, a German/English input sentence is read from the input device 6012 into the input processing section 6014 (6100). The input sentence read into section 6014 is stored in the buffer 6014a, and the stored sentence is read out from buffer 6014a to the unit division section 6016.

Once the input sentence has been read in, section 6016 reads the delimiters from table 6018 in order to divide the sentence into dictionary lookup units (6102). That is, the character strings making up the input sentence are divided, in order from the beginning, into retrieval key strings that serve as the units for looking up the reference dictionary 6020, by splitting them wherever delimiters such as spaces and colons occur. It is then judged whether or not the retrieval key strings have been exhausted (6104); if a retrieval string remains, it is passed to the dictionary retrieval section 6022.

When the retrieval string has been passed to section 6022, that section looks up the reference dictionary 6020 for it (6106). It is then judged whether or not the retrieval string is present among the entries of the reference dictionary 6020 shown in Fig. 48 (6108). If there is an entry, the part-of-speech information stored in the reference dictionary 6020 is read out and it is judged whether or not the retrieval string is a proper name (6110).

If the retrieval string is not a proper name, section 6022 passes the data read from the reference dictionary 6020 to processing section 6036 and stores it in table 6036a (6112). Once the data has been stored in table 6036a, processing section 6036 sends section 6016 a signal indicating that the data has been stored, together with the data for the retrieval string that was stored immediately before. The flow then returns to step 6102, and the division into dictionary lookup units continues in section 6016.

If the retrieval string is judged at step 6110 to be a proper name, section 6022 passes the proper name read from the reference dictionary 6020 (hereinafter simply called the proper name), together with the data for the preceding retrieval string supplied from table 6036a via section 6016, to section 6024, where the processing for a proper name registered in the dictionary is performed (6114).

The processing for a proper name registered in the dictionary is now described with reference to the flowchart shown in Fig. 50. The data passed from section 6022 to section 6024 is forwarded through the preceding-sentence-end processing section 6026 to the preceding-proper-name processing section 6028. Section 6026 has no function in the processing of a proper name registered in the dictionary.

Section 6028 then judges whether or not the retrieval string preceding the proper name is a proper name not registered in the reference dictionary 6020, that is, a proper name that has been subjected to the processing for unregistered proper names described later (6200). If it is an unregistered proper name, the proper name and the preceding unregistered proper name are together judged to be a single proper name having the feature information of the proper name (6202); the data is then passed to processing section 6036 and stored in its table 6036a (6214).

If section 6028 judges at step 6200 that the retrieval string preceding the proper name is not an unregistered proper name, it judges whether or not that string is a proper name registered in the reference dictionary 6020 (6204). If the preceding retrieval string is a registered proper name, it is judged whether or not the feature information of the preceding proper name is unknown, that is, whether or not it is missing from the reference dictionary 6020 (6206).

If the feature information of the preceding proper name is unknown, the flow advances to step 6202, where the proper name and the preceding proper name are together regarded as a single proper name having the feature information of the proper name (6202); section 6028 then passes the data to processing section 6036, where it is stored in table 6036a (6214).

If section 6028 judges that the feature information of the preceding proper name is not unknown, that is, that it is registered in the reference dictionary 6020, the data is passed from section 6028 to section 6030. Section 6030 then judges whether or not the feature information of the proper name is known (6208). If it is unknown, section 6030 judges the proper name and the preceding proper name together to be a single proper name having the feature information of the preceding proper name (6210) and passes the data to section 6036, where it is recorded in table 6036a (6214).

If section 6030 finds that the feature information of the proper name is not unknown, that is, that it is registered in the reference dictionary 6020, it judges the proper name to be a proper name with the feature information retrieved from the reference dictionary 6020, independently of the preceding proper name (6212), and passes the data to processing section 6036, where it is recorded in table 6036a (6214).

Returning to Fig. 49, if the retrieval string has no entry in the reference dictionary 6020 at step 6108, it is judged whether or not the first character of the string is a capital letter (6116). If it is not a capital letter, section 6022 judges the retrieval string to be an unregistered word, passes it to section 6036, and stores it in table 6036a (6118).
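The decision sequence of Fig. 50 for a registered proper name (steps 6200 to 6212) can be sketched in Python as follows. The `Item` class, the `kind` labels, and the function name are hypothetical. Treating a merged name as "registered" afterwards is an assumption, as is the handling of a preceding item that is not a proper name at all (the current name is simply kept separate).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Item:
    text: str
    kind: str                       # "registered", "unregistered" or "other"
    feature: Optional[str] = None   # None: feature information unknown


def process_registered(prev: Optional[Item], cur: Item) -> List[Item]:
    """Return the table items that result from (prev, cur) under Fig. 50."""

    def merged(feature: Optional[str]) -> List[Item]:
        # Assumption: the combined name behaves like a registered name later.
        return [Item(prev.text + " " + cur.text, "registered", feature)]

    if prev is not None and prev.kind == "unregistered":    # 6200
        return merged(cur.feature)                           # 6202
    if prev is None or prev.kind != "registered":            # 6204
        return ([prev] if prev is not None else []) + [cur]  # kept separate
    if prev.feature is None:                                 # 6206
        return merged(cur.feature)                           # 6202
    if cur.feature is None:                                  # 6208
        return merged(prev.feature)                          # 6210
    return [prev, cur]                                       # 6212
```

For instance, an unregistered "Tokyo" followed by "Station" with the feature "place" yields one item "Tokyo Station" with the feature "place", while "Station" (place) followed by "Mr." (person) leaves both items unchanged.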

If the first character is a capital letter, section 6022 passes the data for the retrieval string, together with the data for the preceding retrieval string, to section 6024, where the processing for an unregistered proper name is performed (6120).
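The four-way branching of Fig. 49 (steps 6106 to 6120) amounts to a small dispatch function. The Python sketch below is illustrative only: the dictionary layout, the `"pos"` key, and the route names are assumptions introduced for the example.

```python
def classify_route(unit, dictionary):
    """Return which branch of Fig. 49 handles a non-empty lookup unit."""
    entry = dictionary.get(unit)                 # 6106 and 6108
    if entry is not None:
        if entry["pos"] == "proper_name":        # 6110
            return "registered_proper_name"      # 6114, continues in Fig. 50
        return "store_entry"                     # 6112
    if unit[:1].isupper():                       # 6116
        return "unregistered_proper_name"        # 6120, continues in Fig. 51
    return "unregistered_word"                   # 6118


# Hypothetical miniature dictionary for demonstration.
toy_dictionary = {
    "Station": {"pos": "proper_name"},
    "in": {"pos": "preposition"},
}

for unit in ["Station", "in", "Tokyo", "xyz"]:
    print(unit, "->", classify_route(unit, toy_dictionary))
```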

The processing for a proper name not registered in the dictionary is now described with reference to Fig. 51. The data for the retrieval string is passed, together with the data for the preceding entry recorded in the table, to the preceding-sentence-end processing section 6026, which judges whether or not the end of the preceding entry recorded in the table is a candidate for the end of a sentence (6300), for example a standalone period (.).

If the end of the preceding entry recorded in the table is a candidate for the end of a sentence, the data is passed from section 6026 to the preceding-proper-name processing section 6028. Section 6028 then judges the preceding entry recorded in the table to be the end of a sentence (6302), changes the capital letter at the beginning of the retrieval string to lower case, and passes the string to the dictionary retrieval section 6022.

Section 6022 looks up the reference dictionary 6020 for the retrieval string rewritten with a lower-case initial (6304) and judges whether or not it has an entry in the reference dictionary 6020 (6306). If there is an entry, section 6022 passes the data retrieved from the reference dictionary 6020 to processing section 6036 and stores it in table 6036a (6308). If there is no entry, section 6022 restores the first character of the retrieval string to a capital letter, passes the string to processing section 6036 as an unregistered proper name, and stores it in table 6036a (6310).

If section 6026 judges at step 6300 that the end of the preceding entry recorded in the table is not a candidate for the end of a sentence, the data is passed from section 6026 to section 6028, which then judges that the preceding entry recorded in the table is not the end of a sentence (6312). The data is then passed from section 6028 to section 6030, which judges the retrieval string to be a proper name whose feature information is unknown (6314).

Section 6030 then returns the data to section 6028, where the processing for a proper name registered in the dictionary is performed (6316). This processing is the same as that shown in Fig. 50.
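The flow of Fig. 51 for a capitalized string with no dictionary entry can be sketched in Python as below. The function name, the result labels, and the set `SENTENCE_END_CANDIDATES` are hypothetical. Treating a missing preceding entry (the top of the file) as a sentence start follows the worked example in the description; a standalone period is the only sentence-end candidate the text names, so the set contains just that.

```python
# Assumed set of sentence-end candidates; the text names only a
# standalone period as an example.
SENTENCE_END_CANDIDATES = {"."}


def process_unregistered(unit, prev_text, dictionary):
    """Handle a non-empty capitalized unit that has no dictionary entry.

    prev_text is the text of the preceding table entry, or None at the
    top of the file. Returns (text, classification).
    """
    at_sentence_start = (
        prev_text is None or prev_text in SENTENCE_END_CANDIDATES   # 6300
    )
    if at_sentence_start:                                   # 6302
        lowered = unit[0].lower() + unit[1:]
        if lowered in dictionary:                           # 6304 and 6306
            return lowered, "dictionary_word"               # 6308
        return unit, "unregistered_proper_name"             # 6310
    # 6312 and 6314; the caller would continue with Fig. 50 (6316).
    return unit, "proper_name_feature_unknown"
```

The second lookup with a lower-case initial is what keeps an ordinary sentence-initial word such as "In" from being mistaken for a proper name.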

Returning to Fig. 49, when the dictionary lookup units are exhausted at step 6104, the data recorded in table 6036a is passed from section 6036 to section 6038 (6122), which completes the morphological analysis of this embodiment. The operation of the embodiment described above is now explained using an example input sentence.

The explanation refers to Fig. 52 and assumes, for example, the input sentence "In Tokyo Station Mr. Walter . . ." (German: "Im Bahnhof Tokyo Mr. Walter . . ."). First, input processing (6100) is performed, in which the input sentence is read into the input processing section 6014. The sentence is then divided into dictionary lookup units (6102) by splitting it at the spaces into individual words. First, the reference dictionary 6020 is looked up for "In" (6106). There is no entry for "In" in the reference dictionary 6020. The flow therefore advances to the processing for a proper name not registered in the dictionary; because the preceding part is recognized as the beginning of the sentence (the top of the file), "In" is converted to "in". Since "in" has an entry in the reference dictionary 6020 and is not a proper name (6110), the data retrieved from the reference dictionary 6020 is recorded in table 6036a (6112).

Next, the reference dictionary 6020 is looked up for "Tokyo" (6106). Since there is no entry for "Tokyo" in the reference dictionary 6020 (6108) and the first character is a capital letter (6116), the processing for a proper name not registered in the dictionary is performed (6120), and the flow advances to Fig. 51. Since the preceding part "in" is not a candidate for the end of a sentence (6300), it is recognized as not being the end of a sentence (6312); "Tokyo" is recognized as a proper name whose feature information is unknown (6314), and the processing for a proper name registered in the dictionary is performed (6316). The flow then advances to Fig. 50. Since the preceding "in" is neither an unregistered nor a registered proper name (6204), "Tokyo" alone is recorded as a proper name with its own feature information, that is, as a proper name whose feature information is unknown (6216).

The flow then returns to Fig. 49, and the reference dictionary 6020 is looked up for "Station" (6106). Since there is an entry for "Station" in the reference dictionary 6020 (6108) and it is a proper name (6110), the processing for a proper name registered in the dictionary is performed (6114), and the flow advances to Fig. 50. Since the preceding "Tokyo" is an unregistered proper name (6200), "Tokyo Station" as a whole is recorded as a single proper name having the feature information "place" taken from "Station" (6202).

Next, returning to Fig. 49, the reference dictionary 6020 is looked up for "Mr." (6106). Since there is an entry for "Mr." in the reference dictionary 6020 and it is a proper name (6110), the processing for a proper name registered in the dictionary is performed (6114), and the flow advances to Fig. 50. The preceding expression "Station" is not an unregistered proper name (6200) but a registered proper name (6204), and its feature information "place" is not unknown (6206). Since the feature information of "Mr." is likewise not unknown (6208), "Mr." alone is recorded as a proper name with the feature information "person" (6212).

Returning again to Fig. 49, the reference dictionary 6020 is looked up for "Walter" (6106). Since "Walter" has an entry in the reference dictionary 6020 (6108) and is a proper name (6110), the processing for a proper name registered in the dictionary is performed (6114), and the flow advances to Fig. 50. The preceding "Mr." is not an unregistered proper name (6200) but a registered proper name (6204) whose feature information "person" is not unknown (6206), while the feature information of "Walter" is unknown (6208); "Mr. Walter" is therefore combined and recorded as a single proper name with the feature information "person" (6210).
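The walk-through above can be reproduced end to end with a compact Python sketch. Everything in it is illustrative: the `DICTIONARY` contents are inferred from the example (Fig. 48 is not reproduced here), rows are plain lists, splitting uses spaces only, and a merged name is assumed to count as a registered proper name afterwards.

```python
# Hypothetical dictionary: string -> (part of speech, feature or None).
DICTIONARY = {
    "in": ("preposition", None),
    "Station": ("proper_name", "place"),
    "Mr.": ("proper_name", "person"),
    "Walter": ("proper_name", None),     # registered, feature unknown
}


def analyse(sentence):
    table = []   # rows: [text, kind, feature]

    def registered(cur):                                     # Fig. 50
        prev = table[-1] if table else None
        if prev is None or prev[1] not in ("registered", "unregistered"):
            table.append(cur)                                # 6204, 6216
            return
        if prev[1] == "unregistered" or prev[2] is None:     # 6200, 6206
            feature = cur[2]                                 # 6202
        elif cur[2] is None:                                 # 6208
            feature = prev[2]                                # 6210
        else:
            table.append(cur)                                # 6212
            return
        table[-1] = [prev[0] + " " + cur[0], "registered", feature]

    for unit in sentence.split():                            # 6102
        entry = DICTIONARY.get(unit)                         # 6106
        if entry and entry[0] == "proper_name":              # 6110
            registered([unit, "registered", entry[1]])       # 6114
        elif entry or not unit[0].isupper():                 # 6112, 6118
            table.append([unit, "word", None])
        else:                                                # Fig. 51
            prev = table[-1] if table else None
            if prev is None or prev[0] == ".":               # 6300
                lowered = unit[0].lower() + unit[1:]
                if lowered in DICTIONARY:                    # 6306
                    table.append([lowered, "word", None])    # 6308
                else:
                    table.append([unit, "unregistered", None])  # 6310
            else:
                registered([unit, "unregistered", None])     # 6314, 6316
    return [(text, feature) for text, _kind, feature in table]


print(analyse("In Tokyo Station Mr. Walter"))
```

Under these assumptions the table ends up with three rows: "in" without feature information, "Tokyo Station" with the feature "place", and "Mr. Walter" with the feature "person", in line with the result described for Fig. 52.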

As described above, in this embodiment the German/English input sentence is divided into retrieval character strings, and the reference dictionary 6020 is first searched for each of them. If the reference dictionary 6020 contains an entry in the form of a proper name, the processing for a registered proper name is carried out, taking into account the preceding entry recorded in the table.

If the preceding entry recorded in the table is a proper name, the feature information of that preceding entry and of the proper name currently being processed is checked. If one of the two lacks feature information, it is given the feature information of the other; if both have feature information, they are recognized individually as proper names, each with its own feature information.
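The rule just stated can be illustrated by a minimal sketch. It assumes that the table holds only proper names and that a missing feature is modelled as `None`; the names `ProperName` and `register_proper_name` are hypothetical and do not appear in the description. Following the "Mr. Walter" example, the two names are combined into one entry when exactly one of them lacks feature information. The case in which both lack it is not covered by the text; the sketch simply keeps them separate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProperName:
    text: str
    feature: Optional[str]  # None models "feature information unknown"


def register_proper_name(table: List[ProperName], current: ProperName) -> None:
    """Record `current`, combining it with the preceding proper name when
    exactly one of the two lacks feature information."""
    if table:
        prev = table[-1]
        if (prev.feature is None) != (current.feature is None):
            # One side has no feature: combine both and share the known feature.
            table[-1] = ProperName(f"{prev.text} {current.text}",
                                   prev.feature or current.feature)
            return
    # Both have a feature, or there is no preceding proper name
    # (or, as an assumption of this sketch, both are unknown): record separately.
    table.append(current)


table: List[ProperName] = []
for name in (ProperName("station", "place"),
             ProperName("Mr.", "person"),
             ProperName("Walter", None)):
    register_proper_name(table, name)
```

After the three calls the table holds "station" with the feature "place" and the combined entry "Mr. Walter" with the feature "person", which corresponds to the two walk-throughs given for Fig. 50.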

Consequently, a proper name that has no feature information can be given appropriate feature information, and the feature information so assigned can be suitably restricted. This permits a more effective analysis in the subsequent structural analysis and a corresponding translation.

Furthermore, for a character string that is not registered in the reference dictionary 6020, if its first character is an uppercase letter and the preceding character string is judged to be the end of a sentence, the reference dictionary 6020 is searched again after the uppercase letter has been changed to lowercase; the reference dictionary 6020 can therefore also be searched for the character string at the beginning of a sentence. Moreover, if a character string beginning with an uppercase letter appears anywhere other than at the beginning of a sentence, it is judged to be a proper name, and it is given the feature information of a proper name with registered feature information that precedes or follows it. Consequently, even a proper name that is not registered in the reference dictionary 6020 can be parsed to a certain extent.
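The handling of capitalized character strings described above can be sketched as follows. The dictionary is modelled as a plain mapping from string to category, and the function name `classify_string` and the category labels are hypothetical. What happens when the lowercase retry at the beginning of a sentence also fails is not stated in the description; the sketch returns "unknown" in that case as an assumption.

```python
from typing import Dict, Tuple


def classify_string(word: str, dictionary: Dict[str, str],
                    at_sentence_start: bool) -> Tuple[str, str]:
    """Return (lookup key, category) for one character string."""
    if word in dictionary:
        return word, dictionary[word]
    if word[:1].isupper():
        if at_sentence_start:
            # Retry the lookup with the first letter changed to lowercase.
            lowered = word[0].lower() + word[1:]
            if lowered in dictionary:
                return lowered, dictionary[lowered]
        else:
            # Capitalized inside a sentence and not registered: proper name.
            return word, "proper_name"
    return word, "unknown"  # assumption: the text leaves this case open


dictionary = {"the": "article", "station": "noun"}
first = classify_string("The", dictionary, at_sentence_start=True)
inner = classify_string("Walter", dictionary, at_sentence_start=False)
```

A proper name found in this way would subsequently take its feature information from a neighbouring registered proper name, as described in the paragraph above.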

A seventh embodiment of the invention will now be described. Fig. 54 shows the overall structure of the seventh embodiment of a speech analyzer according to the invention, used in an automatic English/Japanese translation device. This embodiment has an input section 7010 through which an English text 7012 to be translated into Japanese is entered. The input section 7010 may comprise, for example, a keyboard with character keys such as alphanumeric or function keys, an optical character reader (OCR) that reads English text recorded on paper, and/or a file storage device for reading English text recorded on a storage medium such as a magnetic disk.

The English text entered through the input section 7010 is read into a pre-editing section 7014, in which pre-processing for translation is carried out. This mainly consists of sentence recognition and the handling of unknown words.

This then functions as part of the morphological analysis. The pre-edited English data are transferred, together with the information obtained during pre-editing, to a morphological analysis section 7016. The section 7016 parses the morphemes of the English sentence by dividing it up with reference to a word dictionary 7018, performs various kinds of grouping, such as unknown-word processing and expressions for proper names, time, numbers and so on, and carries out processing for the sentence as a whole, such as tag-question and apposition recognition. The morphological analysis rules are held in a rule file 7036.

The English data that have undergone the morphological analysis are transferred, together with the dictionary information obtained by the morphological analysis, to a syntactic analysis section I 7020. The section I 7020 is a functional section that parses the surface structure of the sentence by applying grammar rules to the English data, and finds all structural possibilities. The English data analyzed in section I 7020 are then fed, together with the analysis information, to a syntactic analysis section II 7020, in which one solution is selected from the results of the surface-structure analysis in section I by applying a structural description. A plausible parse tree of the English sentence is thus prepared and its structure established. The analysis rules are likewise stored in the rule file 7036.

After the syntactic and morphological analysis, the English data are transferred as parse-tree data to a structure conversion section 7024. The section 7024 prepares a corresponding Japanese structure tree from the parse tree, which is an English intermediate structure, converting it into an underlying Japanese structure from which a Japanese sentence can easily be generated.

The structure-tree data representing the converted underlying Japanese structure are passed to a translation section 7062, where a translated sentence is generated. This is a functional section for generating a Japanese sentence from the Japanese structure tree. The Japanese data formed as a translated sentence, i.e. the translated-sentence data, are passed to a post-editing section 7030. The section 7030 modifies the translated-sentence data with reference to the dictionary 7018, using information that was used in the translation, in order to produce a more natural Japanese sentence. The data for the Japanese sentence are transferred to an output section 7032 and output from there as a translated Japanese sentence 7034. The output section 7032 may comprise, for example, a printer, a display and/or a file storage device such as a magnetic disk. The flow of this series of translation processes is controlled by a control section 7038, which governs the entire device.

In this embodiment the word dictionary 7018 stores dictionary data for English and Japanese words. A vocabulary is defined, together with a connection relationship, i.e. a co-occurrence relationship, and various information such as meaning, singular or plural form, part of speech and so on. Rule data for the morphological and syntactic analysis are stored in the rule file 7036. The control section 7038 is connected to an operation/display section 7040, which has operating keys for issuing various commands from an operator to the device, for example a translation command key and a cursor key, and a display that visually presents various information to the operator, such as the entered Japanese sentence text, the Japanese sentence resulting from translation, and intermediate data such as dictionary information.

Most of the operation and display functions may be incorporated in a keyboard, if one is provided at the input section 7010, or in a display, if one is provided at the output section 7032.

Fig. 53 shows in detail the morphological analysis section 7016 with regard to the processing of numbers. Only those parts that contribute directly to an understanding of the invention are shown, although the morphological analysis section 7016 naturally also has other functional sections for morphological analysis. The section 7016 has an input processing section 7100 for receiving and processing the input character-string data supplied by the pre-editing section 7014. The section 7100 is provided with a buffer into which the data for the English character string are entered in the form of code data such as ASCII, and temporarily stores the character-string data.

The input character-string data temporarily stored in the processing section 7100 are passed to a section 7102, which divides the data into dictionary reference units such as words. The section 7102 is a functional section that successively identifies the dictionary reference units of the character string for the retrieval of the dictionary 7018 in section 7106. The dictionary reference delimiters used in this division are characters other than English letters, numerals, apostrophes, hyphens and periods, together with an apostrophe that follows a space. They are stored in a delimiter table 7104, which is referred to when the dictionary reference units are divided in section 7102.

The dictionary 7018 contains, in particular, information for retrieving the divided units. The dictionary 7018 also stores morpheme-processing information, such as names of months, days of the week, cardinal numbers (which represent only a numerical value), ordinal numbers, units such as grams, time, "the", "of", the comma (,), the period (.), and so on.

The dictionary retrieval section 7106 is a functional section that searches the dictionary 7018 to extract the dictionary information for the character string supplied by section 7102, and passes it to a morpheme-processing-information providing section 7108. The section 7108 provides morpheme-processing information (see Fig. 56) indicating that a character string has a morphemic feature with a temporal meaning, such as hour, year or month, so that more specific information is attached to each character string recognized in the dictionary retrieval section 7106 as a cardinal number or as having a temporal meaning. For example, information is provided such that a numeral, i.e. a number, is marked "yes". The character string provided with this information in section 7108 is then subjected, where necessary, to local parsing.

In this case, a group of dictionary reference units, such as words, that have been provided with morpheme-processing information is combined collectively by means of the local parsing rules. For example, "month name" and "numeric expression" are combined into "month name + numeric expression", i.e. "Oct." and "18" are grouped into "Oct. 18". Collective groupings are also made for "November the 2nd" as "month name + the + numeric expression", "22 March" as "numeric expression + month name", "the 23rd May" as "the + numeric expression + month name", "the 11th of June" as "the + numeric expression + of + month name", "'86, Jan. 27. Mon." as "year + , + month and day + , + day of the week", "Sunday 26, Jan., 1986" as "day of the week + , + month and day + , + year", "11:30 a.m." as "number : number + a.m. (or p.m.)", and for "month name + year", "month name + of + year" and so on.
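A few of the local parsing rules listed above can be written as sequences of tags and applied to a list of tagged dictionary reference units, as in the following minimal sketch. The tag names, the result tag "time_expression", the function `group_units` and the choice of trying the longest rule first are assumptions made for illustration; the description does not specify how competing rules are ordered.

```python
from typing import List, Tuple

Unit = Tuple[str, str]  # (text, tag)

# A subset of the combinations named in the description.
RULES = [
    ("month", "number"),               # "Oct. 18"
    ("month", "the", "number"),        # "November the 2nd"
    ("number", "month"),               # "22 March"
    ("the", "number", "month"),        # "the 23rd May"
    ("the", "number", "of", "month"),  # "the 11th of June"
]


def group_units(units: List[Unit]) -> List[Unit]:
    """Combine every run of units whose tags equal a rule into one unit."""
    rules = sorted(RULES, key=len, reverse=True)  # assumption: longest first
    out: List[Unit] = []
    i = 0
    while i < len(units):
        for rule in rules:
            window = units[i:i + len(rule)]
            if tuple(tag for _, tag in window) == rule:
                out.append((" ".join(text for text, _ in window),
                            "time_expression"))
                i += len(rule)
                break
        else:
            out.append(units[i])  # no rule starts here: keep the unit as is
            i += 1
    return out


sentence = [("He", "other"), ("left", "other"), ("the", "the"),
            ("11th", "number"), ("of", "of"), ("June", "month")]
grouped = group_units(sentence)
```

Here the last four units are combined into the single unit "the 11th of June", while "He" and "left" pass through unchanged.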

The local analysis is carried out by an initial-value setting section 7110, a matching retrieval section 7112, a unit division section 7114, a morpheme-processing-information providing section 7118, query sections 7116 and 7120, processing sections 7122 and 7124, and a matching table 7128. The matching table 7128 contains a morpheme-processing indication table, which is a reference table for determining that a sequence of units indicating a number and time factors, as shown in Fig. 57, is a composite unit of time factors formed according to certain rules. The section 7110 sets the initial value of a counter n, which counts the number of dictionary reference units so that a sequence of dictionary reference units can be matched as one of the unit groups described above during retrieval in the matching retrieval section 7112.

The section 7112 consults the matching table 7128 for each dictionary reference unit in order to perform matching. The unit division section 7114 uses the counter n to distinguish the dictionary reference units for which the dictionary retrieval in section 7106 has been completed, denoted "p", from the dictionary reference units representing the character strings that follow them.

The query section 7116 is a functional section with a function similar to that of the dictionary retrieval section 7106: it searches the dictionary 7018 to extract the dictionary information for the character string separated in section 7114 and passes it to the morpheme-processing-information providing section 7118. The section 7118 has the same function as section 7108, attaching more specific information to each item recognized in the query section 7116 as an ordinal number or a time factor.

The query section 7120 and the processing sections 7122 and 7124 together combine the sequence of dictionary reference units up to "p + n", obtained from the matching retrieval section 7112 through the processing in section 7118, into a single dictionary reference unit. The result is then stored in the table 7126, a buffer for storing the dictionary information for which retrieval has been completed. The result of the morphological analysis is transferred from the table 7126 to the syntactic analysis section I 7020.

The collective combining by means of the morpheme-processing information according to the invention will now be explained with reference to the flowcharts shown in Figs. 55A and 55B. Assume, for example, that the following character string has been entered into the input processing section 7100 (7300).
Input character string: ... "26 Jan., '80 he ..."
The section 7102 divides the input character string into dictionary reference units for retrieval of the dictionary 7018 (7302). "26" in the input character string is separated off as one unit. It is then judged whether the division of the input character string into reference units has been completed; if so, the operation ends (7304), and if not, the flow advances to the next step 7306.

The dictionary 7018 is searched for "26" in the input character string, and dictionary information indicating that "26" is a "number" is extracted (7306). Morpheme-processing information is then provided indicating that "number, number" is a morphemic feature, i.e. that it is a sequence of digits and is treated as one grouped cardinal number (7308). It is next judged whether the unit that received the dictionary information was provided with morpheme-processing information in step 7308 (7310). If it was, the flow advances to step 7314 for further processing based on the local analysis rules; a unit that was not so provided is recorded in the table 7126 (7312), and the flow returns to step 7302. Since "26" has been provided with morpheme-processing information, the flow advances to step 7314 for it. The processing in step 7314 is carried out as shown in the flowchart of Fig. 55B.

First, the initial value "0" is set in the counter n, which counts the number of reference units to be matched when they are retrieved in the matching retrieval section 7112 (7410). With the reference units whose dictionary retrieval in section 7106 has been completed denoted "p", the matching table 7128 is searched by section 7112 for the (p + n)-th (n = 0) reference unit, i.e. "26" (7412). Since "26" was provided in step 7308 with morpheme-processing information indicating that it is an ordinal number, and since the matching table 7128 contains, from the second entry onward, combinations that each have an ordinal number at their head (see Fig. 57), the reference unit "26" agrees with the information in the matching table 7128 and is therefore matched. In this case matching is carried out over the range Ms to Me, where "Ms" denotes the second matched combination and "Me" denotes the last of the combinations in the matching table 7128 that have "ordinal number" at their head.

Based on the matching result in table 7128 for the (p + n)-th (n = 0) dictionary reference unit, the matching state is judged (7414). If a match is found, the flowchart advances to step 7416; if no match is found, it advances to step 7424.

If a match is found, "1" is set in counter n so that the (p + 1)-th (n = 1) dictionary segmentation unit in the input string is segmented. Segmentation is performed in the same way as at step 7302. Dictionary 7018 is consulted for "Jan.," in the input string, since this has been segmented as the dictionary reference unit following "26", and morpheme processing information is produced (7420, 7422). This processing is performed in the same way as at steps 7306 and 7308.

By repeating the operations from step 7412 to step 7422, the flowchart is executed as a closed loop up to "26 Jan., '80 he". However, since "he" does not match against matching table 7128 at step 7412, the flowchart advances from step 7414 to step 7424. In other words, although the data up to "26 Jan., '80" match "cardinal number, month, year" in matching table 7128, "26 Jan., '80 he" does not match.

Furthermore, if the sentence in the input string ends with, for example, "26 Jan., '80", i.e. no further unit can be segmented as the next dictionary reference unit, the flowchart advances from step 7418 to step 7424. When it is found at step 7414 that there is no match, it is then judged whether the content of counter n is no more than 1 (7424); if so, the unit is recorded in table 7126 as a single reference unit (7434).

If it is not less than 1, matching is carried out by treating the (p + n)-th unit (n = 3), i.e. "he" in "26 Jan., '80 he", as "EOS", which marks the end of the combination (7426, 7428). If there is no match, the flowchart advances to step 7434. If there is a match, "26 Jan., '80", i.e. the p-th to (p + n - 1)-th dictionary reference units, is grouped into one unit according to the combination result corresponding to combination Ms in matching table 7128, and the result is recorded in table 7126 (7430). The reference units are then regarded as completed up to the (p + n - 1)-th unit, and (p + n - 1) is set as the new "p" (7432).
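The loop of steps 7412 to 7432 can be sketched roughly as follows. This is only an illustration under stated assumptions: the contents of `MATCH_TABLE` (standing in for matching table 7128), the feature names and the function name `group_length` are hypothetical, and the sketch treats n = 0 as the single-unit case.

```python
EOS = "EOS"

# Hypothetical stand-in for matching table 7128: each row is a combination
# of morphemic features, terminated by EOS.
MATCH_TABLE = [
    ("cardinal", "month", EOS),
    ("cardinal", "month", "year", EOS),
]


def group_length(features, p, table=MATCH_TABLE):
    """Return how many reference units, starting at index p, are grouped.

    `features` holds one morphemic feature per dictionary reference unit.
    A return value of 1 means the unit is recorded on its own.
    """
    candidates = list(table)
    n = 0  # counter n: number of matched reference units
    while p + n < len(features):
        # Keep only the rows whose n-th element agrees with unit p + n.
        survivors = [
            row for row in candidates
            if len(row) > n and row[n] == features[p + n]
        ]
        if not survivors:
            break  # mismatch: leave the loop, as from step 7414
        candidates = survivors
        n += 1
    if n < 1:
        return 1  # nothing matched: single reference unit
    # Treat position p + n as EOS and check that a combination ends here.
    if any(len(row) > n and row[n] == EOS for row in candidates):
        return n  # units p .. p + n - 1 form one group
    return 1


# "26 Jan., '80 he" as a feature sequence (assumed tagging).
print(group_length(["cardinal", "month", "year", "pronoun"], 0))
```

After a group is recorded, the caller would continue from unit p + n, which corresponds to the resetting of "p" at step 7432.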

The eighth embodiment of the invention will now be explained. Fig. 63 shows the overall structure of the eighth embodiment of a speech analyzer according to the invention, applied to an automatic English/Japanese translation device.

Fig. 63 shows an input section 8001, an English text 8002, a pre-editing section 8003, a morphological analysis section 8004, a syntactic analysis section I 8005, a syntactic analysis section II 8006, an operation display section 8007, a word dictionary 8008, a rule file 8009, a control section 8010, a structure conversion section 8011, a translated-sentence generation section 8012, a post-editing section 8013, an output section 8014 and a Japanese sentence 8015. As shown in Fig. 63, the translation device has the input section 8001, through which the English text 8002 to be translated into Japanese is entered. The input section 8001 may comprise, for example, a keyboard with character keys such as alphanumeric and function keys, an optical character reader (OCR) for reading the English text, and/or a file storage device for reading English text recorded on a storage medium such as a magnetic disk.

The English text entered through section 8001 is read into the pre-editing section 8003, where pre-processing for translation is carried out. This consists mainly of sentence recognition and the handling of unknown words, and functions as part of the morphological analysis.

The pre-edited English data are passed, together with the information obtained during pre-editing, to section 8004. Referring to the word dictionary 8008, section 8004 segments the data, analyzes the English morphemes, performs various kinds of grouping, such as the handling of unknown words and of expressions for proper names, times and numbers, and performs processing that concerns the whole sentence, such as tag-question and apposition recognition. The rules for morphological analysis are held in the rule file 8009.

After morphological analysis, the English data are passed, together with the dictionary information obtained by that analysis, to section I 8005. Section I 8005 is a functional section that analyzes the surface structure of the sentences by applying grammar rules to the English data and finds all structural possibilities.

After syntactic analysis in section I 8005, the English data are passed, together with the associated information, to section II 8006, where one solution is selected from the surface-structure results of syntactic analysis I by applying a structural description. A plausible parse tree of the English sentence is thereby prepared to establish its structure. These parsing rules are likewise held in the rule file 8009.

The syntactically analyzed English data are transferred, as the data of the corresponding parse tree, to the structure conversion section 8011. Section 8011 prepares a corresponding Japanese structure tree from the structure tree that forms the intermediate structure of the English sentence, converting it into an underlying Japanese structure from which the Japanese sentence can readily be generated.

The data of the structure tree representing the underlying Japanese structure obtained in this way are passed to the translation section 8012, where the translated sentence is generated. This is a functional section that generates a Japanese sentence from the structure tree corresponding to that sentence.

The Japanese sentence data translated in this way are passed to the post-editing section 8013, which modifies the translated data by consulting the word dictionary 8008 and uses the information employed during translation to polish the result into a more natural Japanese sentence. The data for the Japanese sentence are transferred to the output section 8014 and output from it as the translated Japanese sentence 8015. The output section 8014 comprises a printer, a display and/or a file storage device such as a magnetic disk.

The flow of this series of translation operations is controlled by the control section 8010, which governs the device as a whole. In the illustrated embodiment, the word dictionary 8008 contains dictionary data for English and Japanese words, with various information attached to each vocabulary item, such as connective relations (i.e. co-occurrence relations), meanings, singular and plural forms and part of speech. The rule file 8009 contains rule data for morphological analysis and for parsing English sentences.
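The sequence of sections that the control section 8010 steps through can be pictured as a simple pipeline. The sketch below is only a schematic under assumptions: the stage bodies are placeholders that record their own name, and `make_stage`, `STAGES` and `translate` are hypothetical names, not part of the described device.

```python
def make_stage(name):
    """Build a placeholder stage that only records that it ran."""
    def stage(data):
        # A real stage would transform data["payload"]; here we only trace.
        return {**data, "trace": data["trace"] + [name]}
    return stage


# Section numbers follow Fig. 63; the order is the one given in the text.
STAGES = [
    ("8003", make_stage("pre-editing")),
    ("8004", make_stage("morphological analysis")),
    ("8005", make_stage("syntactic analysis I")),
    ("8006", make_stage("syntactic analysis II")),
    ("8011", make_stage("structure conversion")),
    ("8012", make_stage("sentence generation")),
    ("8013", make_stage("post-editing")),
]


def translate(text, stages=STAGES):
    """Pass the input through every stage in order, as the controller would."""
    data = {"payload": text, "trace": []}
    for _section, stage in stages:
        data = stage(data)
    return data


result = translate("He came on 26 Jan., '80.")
print(result["trace"])
```

Keeping the stage list as data makes the order explicit and lets a single controller function drive every stage in turn.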

The control section 8010 is connected to the operation display section 8007, which has operation keys for issuing various commands from an operator to the device, such as a translation command key and cursor keys, and a display that visually presents the entered English text, the Japanese sentence resulting from translation, intermediate data such as dictionary information, and various prompts to the operator. Most of these operation and display functions may instead be provided by the keyboard, if one is included in the input section 8001, or by the display, if one is included in the output section 8014.

This embodiment of the invention relates to the automatic translation device described above. When a derived word is contained in the English text 8002, it serves to evaluate a grammatical feature, a semantic feature, the (Japanese) equivalent and so on according to the conditions for the derived word, which is recognized for example by an affix, and thereby to increase the reliability of the resulting parse or translation. An affix dictionary is used to evaluate dictionary information for unknown words in the morphological analysis. Three kinds of processing are required as processing modes: processing for the prefix, processing for the suffix, and evaluation processing by the suffix. The data, however, comprise essentially two types, i.e. prefix evaluation data and suffix evaluation data.

First, the prefix and suffix evaluation data are explained in detail.

(1) Prefix evaluation data

If the prefix part of a word not registered in the dictionary matches a prefix entry as described below and the remaining part is present in the dictionary, the word is treated according to the dictionary data for its root. In the dictionary data, an internal feature can be added to the original list of internal features, and a Japanese suffix can be added to the original translation word. For the input word "electrochemical", for example, the entry "electrochemical" can be formed in the dictionary information preservation table from the prefix entry "electro" and the dictionary entry "chemical", using the dictionary data, as shown below:

Prefix entry: electro
Dictionary entry: chemical
Input word: electrochemical
Dictionary information table entry: electrochemical

The entry in the table then inherits all the dictionary information that corresponds to the word root.
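A minimal sketch of prefix evaluation follows, assuming a dictionary keyed by word and a prefix table keyed by prefix. The field names (`pos`, `features`, `equivalent`, `ja_suffix`) and the placeholder strings standing in for Japanese text are hypothetical; only the "electro" + "chemical" example comes from the description.

```python
# Toy word dictionary; "<ja:chemical>" is a placeholder, not real Japanese.
DICTIONARY = {
    "chemical": {
        "pos": "adjective",
        "features": ["ADJ"],
        "equivalent": "<ja:chemical>",
    },
}

# Toy prefix dictionary: an internal feature and a Japanese suffix to add.
PREFIXES = {
    "electro": {"feature": "ELECTRO", "ja_suffix": "<ja:electro>"},
}


def evaluate_prefix(word, dictionary=DICTIONARY, prefixes=PREFIXES):
    """Return a table entry for an unregistered word, or None."""
    for prefix, info in prefixes.items():
        root = word[len(prefix):]
        if word.startswith(prefix) and root in dictionary:
            # The new entry inherits everything from the root entry ...
            entry = dict(dictionary[root])
            # ... plus one added internal feature and a Japanese suffix.
            entry["features"] = entry["features"] + [info["feature"]]
            entry["equivalent"] = entry["equivalent"] + info["ja_suffix"]
            return entry
    return None


print(evaluate_prefix("electrochemical"))
```

The root entry is copied rather than modified, so the registered word "chemical" keeps its original dictionary data.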

(2) Suffix evaluation data

If the suffix part of a word not registered in the dictionary matches a suffix entry as described below and the remaining part is present in the dictionary, the word is registered by creating new dictionary data according to the information described in the suffix dictionary. In this case, the first (Japanese) equivalent in the dictionary data for the root of the word is taken and used as the (Japanese) equivalent in the new dictionary data.

For the input word "controler", for example, an entry "controler" is registered in the table on the basis of the suffix entry "-(e)r" and the dictionary entry "control".

Suffix entry: (verb)-(e)r, noun
Dictionary entry: control, verb
Input word: controler
Dictionary information table entry: controler, noun
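Suffix evaluation differs from prefix evaluation in that a new entry is created instead of inherited. The sketch below assumes a suffix table that names the part of speech required of the root and the part of speech of the derived word; the field names and the placeholder equivalents are hypothetical, and "-(e)r" is modelled as the two suffixes "er" and "r".

```python
# Toy word dictionary; the equivalents are placeholders, not real Japanese.
DICTIONARY = {
    "control": {"pos": "verb", "equivalents": ["<ja:control-1>", "<ja:control-2>"]},
}

# Toy suffix dictionary for "(verb)-(e)r -> noun".
SUFFIXES = {
    "er": {"root_pos": "verb", "new_pos": "noun"},
    "r": {"root_pos": "verb", "new_pos": "noun"},
}


def evaluate_suffix(word, dictionary=DICTIONARY, suffixes=SUFFIXES):
    """Create new dictionary data for an unregistered word, or return None."""
    for suffix, rule in suffixes.items():
        if not word.endswith(suffix):
            continue
        root = word[:-len(suffix)]
        entry = dictionary.get(root)
        if entry is not None and entry["pos"] == rule["root_pos"]:
            return {
                "pos": rule["new_pos"],
                # Only the first equivalent of the root is carried over.
                "equivalents": [entry["equivalents"][0]],
            }
    return None


print(evaluate_suffix("controler"))
```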

An outline of derived-word processing and unknown-word processing is now given.

(1) If a word is not registered in the dictionary but has a prefix or suffix at its beginning or end, and the remainder of the word is registered in the dictionary, the English part-of-speech information, the internal feature and the Japanese equivalent are composed from the dictionary information and the affix information.
(2) Prefixes and suffixes are listed in a file system and can be edited independently of the program.
(3) The possibility of a prefix is tried first; if this fails, the possibility of a suffix is tried. If both are contained, no attempt is made.
(4) For a word for which this trial evaluation fails, unknown-word processing based on the end part of the word is performed.
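The order of attempts in items (3) and (4) can be sketched as a small dispatcher. The three strategy functions below are toy stand-ins with hypothetical names and rules (in particular, the ending rule in `by_ending` is an assumption made only so that the example runs).

```python
KNOWN = {"chemical": "adjective", "control": "verb"}


def by_prefix(word):
    """Toy prefix analysis: only the prefix "electro" is known."""
    root = word[len("electro"):]
    if word.startswith("electro") and root in KNOWN:
        return {"pos": KNOWN[root]}
    return None


def by_suffix(word):
    """Toy suffix analysis: only the suffix "er" is known."""
    if word.endswith("er") and word[:-2] in KNOWN:
        return {"pos": "noun"}
    return None


def by_ending(word):
    """Toy unknown-word guess from the ending (assumed rule)."""
    return {"pos": "verb" if word.endswith("ize") else "noun"}


def analyse_unregistered(word):
    """Try prefix first, then suffix, then fall back to the ending."""
    for name, strategy in (("prefix", by_prefix), ("suffix", by_suffix)):
        entry = strategy(word)
        if entry is not None:
            return name, entry
    # Neither affix analysis succeeded, including words with both affixes.
    return "unknown", by_ending(word)


for w in ("electrochemical", "controler", "electrocontroler"):
    print(w, analyse_unregistered(w))
```

In this sketch a word carrying both a prefix and a suffix, such as "electrocontroler", is not decomposed twice; it falls through to the unknown-word branch, in line with item (3).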

The eighth embodiment of the invention will now be explained in more detail with reference to the block diagram of Fig. 58. Fig. 58 shows an input processing section 8020, an input segmentation section 8021, a delimiter table 8022, a dictionary lookup section 8026, a derivation processing section 8023, a reference dictionary 8024 and a dictionary information preservation table 8025. First, an English sentence is written into the input processing section from an input device such as a source input file, a keyboard or an OCR. The dictionary reference unit is then segmented in the segmentation section with reference to the delimiter table and, unless the input has ended, a dictionary lookup is attempted using the reference dictionary. If the lookup yields an entry, the result is recorded in the dictionary information preservation table; if it yields no entry, derived-word processing is performed.

Fig. 59 is a block diagram explaining an embodiment of derivation processing by means of a prefix. Fig. 59 shows a matching section 8030 between the head of the word and the prefix dictionary, a prefix dictionary 8031, a section 8032 for dictionary lookup of the part other than the prefix, a matching section 8033 between the part of speech in the prefix dictionary and the part of speech of the entry, a section 8034 for preparing dictionary information by prefix evaluation, and a derivation processing section 8035 by prefix.

Fig. 60 is a block diagram explaining an embodiment of derivation processing by means of a suffix. Fig. 60 shows a matching section 8040 between the end of the word and the suffix dictionary, a suffix dictionary 8041, a processing section 8042 for a completely unregistered (entirely unknown) word, a dictionary lookup section 8043 for looking up the part other than the suffix, a matching section 8044 between the part of speech of the root given in the suffix dictionary and the part of speech of the entry, a processing section 8045 for preparing dictionary information by suffix evaluation, a dictionary lookup section 8046 for looking up the part other than the suffix after applying the root change given in the suffix dictionary, and a section 8047 for processing an unregistered word by suffix evaluation. First, the end of the word is matched against the suffix dictionary. If there is no match, the word is processed as a completely unregistered word; if there is a match, the dictionary is looked up for the part other than the suffix. If this lookup yields no entry, the dictionary is looked up again after the root change given in the suffix dictionary has been applied to the part other than the suffix. If there is still no entry, unregistered-word processing by suffix evaluation is performed. If there is an entry, the part of speech of the entry is matched against the part of speech of the root in the suffix dictionary, in the same way as when the first lookup of the part other than the suffix yields an entry. If they match, dictionary information is prepared by suffix evaluation; if they do not match, unregistered-word processing by suffix evaluation is performed.
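The branching described for Fig. 60 can be summarized in a short routing function. This is a sketch under assumptions: the suffix "ness", the root change "i" to "y" and the dictionary contents are illustrative examples that do not come from the description, and the returned labels are descriptive names chosen here for the three outcomes.

```python
# suffix -> (part of speech required of the root, root changes to try)
SUFFIX_DICT = {
    "ness": ("adjective", [("i", "y")]),
}

WORD_DICT = {"happy": "adjective", "kind": "adjective", "run": "verb"}


def route_by_suffix(word):
    """Return which kind of processing the word is routed to."""
    for suffix, (root_pos, changes) in SUFFIX_DICT.items():
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[:-len(suffix)]
            break
    else:
        # The end of the word matches no suffix entry.
        return "fully_unregistered"

    # First lookup: the part other than the suffix, unchanged.
    pos = WORD_DICT.get(stem)
    if pos is None:
        # Second lookup: apply each root change from the suffix dictionary.
        for old, new in changes:
            if stem.endswith(old):
                pos = WORD_DICT.get(stem[:-len(old)] + new)
                if pos is not None:
                    break

    # No entry at all, or the part of speech of the root does not match.
    if pos is None or pos != root_pos:
        return "unregistered_by_suffix"
    return "prepare_entry"


for w in ("kindness", "happiness", "runness", "table"):
    print(w, route_by_suffix(w))
```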

Fig. 61 is a block diagram showing details of the overall structure obtained by combining the parts of Figs. 58 to 60; Fig. 62 shows the details of the processing section 8042 for the completely unregistered word shown in Fig. 61. That processing section 8042 comprises a processing section 8050 for preparing evaluated information for a noun and a corresponding section 8051 for a verb. Since each of the parts in Figs. 61 and 62 has already been described in detail, they need not be explained again.

A ninth embodiment of the invention will now be described. Fig. 64 shows the overall structure of the ninth embodiment, which is applied to an automatic English-to-Japanese translation device. This embodiment has an input section 9010 through which an English text 9012 to be translated into Japanese is entered. The input section 9010 may comprise a keyboard with character keys such as alphanumeric and function keys, an optical character reader (OCR) for reading English text recorded on paper, and/or a file storage device for reading English text recorded on a storage medium such as a magnetic disk.

The English text entered from the input section 9010 is read into a pre-editing section 9014, in which pre-processing for translation is carried out. Here, mainly sentence recognition and processing of unknown words are performed. This serves as part of the morphological analysis.

The pre-edited English data are transferred, together with the information obtained during pre-editing, to a morphological analysis section 9016. Section 9016 segments the sentence into English morphemes in order to query a word dictionary 9018, carries out various kinds of processing such as unknown-word processing, proper-noun processing and the handling of expressions of time, number and so on, and also carries out processing that concerns the sentence as a whole, such as tag-question handling and apposition recognition. The rules for the morphological analysis are held in a rule file 9036.

The English data that have undergone morphological analysis are transferred, together with the dictionary information obtained from the morphological analysis, to a syntactic analysis section I 9020. In this embodiment, section I 9020 is a functional section that parses the surface structure of the sentence bottom-up and from right to left by applying the CFG rules to the English data, and finds all structural possibilities.

After syntactic analysis in section I 9020, the English data are passed, together with the analysis information, to a syntactic analysis section II 9022. This section selects a solution from the surface-structure results produced by section I by applying a structural description. In this way a plausible parse tree representing the structure of the English sentence is prepared. These parsing rules are likewise stored in the rule file 9036. The syntactically analysed English data are transferred as parse-tree data to the structure conversion section 9024. In section 9024, the parse tree, which is an intermediate structure of the English sentence, is transformed into a structure tree for the corresponding Japanese sentence, that is, into an underlying Japanese structure from which a Japanese sentence can easily be generated. The data for the structure tree representing the underlying Japanese structure obtained in this way are passed to a translation section 9026, in which the translated sentence is generated. This section generates a Japanese sentence from the Japanese structure tree. First, a sentence structure is produced by reordering the structure tree so that it conforms to the structure of Japanese; then morpheme generation is carried out, traversing the sentence structure tree from top to bottom and from left to right, to produce the translated sentence.

The data for the Japanese structure tree generated in this way are translation data, which are passed to a post-editing section 9030. There the translation data are modified by consulting the dictionary 9018 with the aid of the information used during translation processing, so as to produce a more natural Japanese sentence. The data for the Japanese sentence are transferred to an output section 9032 and are output from it as the translated Japanese sentence 9034. The output section 9032 comprises, for example, a printer, a display and/or a file storage device such as a magnetic disk. The sequence of translation operations is controlled by a control section 9038, which governs the device as a whole. In this embodiment, the dictionary file 9018 stores dictionary data for English and Japanese words, and the rule file 9036 stores data for the morphological and syntactic analysis.

The control section 9038 is connected to an operation and display section 9040, which has operating keys with which an operator issues various commands to the device, for example a translation command key or a cursor key, and a display that visually presents the entered English text, the Japanese sentence obtained as the translation result, intermediate data such as dictionary information, and various instructions for the operator. Most of the operating and display functions may instead be incorporated in the keyboard, if one is provided at the input section 9010, or in the display, if one is provided at the output section 9032. In syntactic analysis section I, the CFG rules are applied to the morphologically analysed English sentence from top to bottom and from right to left in order to derive all possible solutions for the sentence structure. The solutions are generally understood in the form of a structure tree. The tree shows how the words or groups contained in a sentence are related to one another, either in a dependency relation or in a coordinate relation, such as a modification relation or a case relation, with dependency expressed in the manner of parent, child, grandchild and so on. Each word or group is located at a node of the structure tree.

In this embodiment, features relating to the form and vocabulary of a sentence are identified before the syntactic analysis in order to determine groupings with respect to sentence structure. These groupings are referred to here as "units" and "blocks".

A "unit" is a word or group of words that forms the smallest unit of the translation process. It is treated in the syntactic analysis exactly like a single word, and the dictionary information for its individual constituent elements is not used.

A "block" is a structural grouping whose interior is parsed with priority over its exterior and which is treated as equivalent to a unit with respect to its exterior. It may be, for example, a clause or a phrase, or anything corresponding to the intermediate symbols used in the CFG grammar. Blocks may also be nested; that is, a block may contain another block. The notion of a block also extends to a sentence, a paragraph or a whole text, each of which can be regarded as a block. Processing that gives priority to partial syntactic analysis in this way is referred to here as "partial parsing". It reduces the wasteful structural solutions described above, which improves the efficiency of the analysis and yields a more plausible result.

In this embodiment, two attributes are defined for a block. One is a symbol in the CFG rules, referred to in this description as the "goal", into which the results of analysing the constituent elements inside the block are to be assembled; that is, a symbol describing the structure or nature of the block. The other is a symbol in the CFG rules, referred to as the "role", which the block carries when syntactic analysis is performed outside the block on the sentence, phrase or clause containing it; that is, a symbol describing the relationship of the block to the other elements.

For example, in the case of the English sentence I "White House isn't white", the goal is a sentence and the role is a noun (clause). Although the goal and the role are identical in most cases, they sometimes differ, as they do here.
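A block with its goal and role can be pictured as a small data structure. The following Python sketch is only an illustration: the class name `Block`, the helper `outer_view`, the token list and the index positions are assumptions and do not appear in the original description.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    start: int    # index of the first token in the block
    end: int      # index of the last token in the block (inclusive)
    goal: str     # symbol the inside of the block is assembled into
    role: str     # symbol the block presents to its surroundings
    children: list = field(default_factory=list)   # nested blocks

def outer_view(tokens, blocks):
    """Replace each top-level block by its role, as seen from outside."""
    out, i = [], 0
    for b in sorted(blocks, key=lambda b: b.start):
        out.extend(tokens[i:b.start])
        out.append(f"<{b.role}>")
        i = b.end + 1
    out.extend(tokens[i:])
    return out

# Hypothetical tokenisation; "X says" stands in for the surrounding sentence.
tokens = ["X", "says", "White", "House", "isn't", "white"]
name = Block(2, 3, goal="noun phrase", role="proper noun")
quote = Block(2, 5, goal="sentence", role="noun", children=[name])

print(outer_view(tokens, [quote]))
```

Inside the quoted block the parser works towards the goal "sentence", while the enclosing sentence sees the whole block as a single noun.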

When, for the embodiment shown in Fig. 62, a structural grouping of the entered English sentence is recognised as a block, the functional sections that determine its goal and role are organised as shown in Fig. 65. As Fig. 65 shows, the structural groupings of the English sentences pre-edited in section 9014 are identified in the morphological analysis section 9016 with the aid of the word dictionary 9018 and the rule file 9036.

The word dictionary 9018 stores dictionary information for English words and phrases. In this embodiment, for example, as shown in Fig. 68, a separate entry is provided for each variant form of a word, and all information is given in expanded form. The part-of-speech information for an entry may list several parts of speech, as Fig. 68 shows. The way in which the dictionary 9018 is organised is not limited to this example.

The rule file 9036 stores, in the form of a table, the start condition that marks the beginning of a block, the end condition that marks its end, and the block preparation information used to create the block with its goal and role. An example is shown in Fig. 69. For example, a block is started by ", conjunction" and ends at the end of the sentence. Consequently a block is formed that runs from the beginning of the sentence up to the "," immediately preceding the conjunction; its goal is a clause and its role is a sentence. A further block is formed from the conjunction to the end of the sentence, in which both the goal and the role are a clause.
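A table of this kind might be represented as follows. This is a minimal Python sketch under stated assumptions: the type `BlockRule`, the class labels such as "conjunction" and "END", and the helper `rules_starting_with` are hypothetical, and the two rows only paraphrase the ", conjunction" and ", relative pronoun" rules described in the text, not the actual contents of Fig. 69.

```python
from typing import NamedTuple

class BlockRule(NamedTuple):
    start: tuple   # sequence of token classes that opens the block
    ends: tuple    # alternative end conditions; "END" means end of sentence
    goal: str      # symbol the inside of the block is assembled into
    roles: tuple   # roles the block may take towards its surroundings

# Hypothetical rows modelled on the rules described in the text.
BLOCK_RULES = (
    BlockRule((",", "conjunction"), ("END",), "clause", ("clause",)),
    BlockRule((",", "relative_pronoun"), (",", "END"), "clause",
              ("adverb", "adjective")),
)

def rules_starting_with(classes):
    """Return the rules whose start condition equals the given class sequence."""
    return [r for r in BLOCK_RULES if r.start == tuple(classes)]

for rule in rules_starting_with([",", "relative_pronoun"]):
    print(rule.ends, rule.goal, rule.roles)
```

Keeping several end conditions in one row mirrors the statement that one start condition may have more than one permitted ending.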

Similarly, a block is started by ", relative pronoun" and ended by "," or by the end of the sentence. As this case shows, several end conditions are permitted for one start condition. If the block is ended by ",", the cluster from the "," preceding the relative pronoun to the next "," forms a block whose goal is a clause and whose role is an adverb or an adjective; that is, the cluster functions as an adverbial or adjectival clause. If the block ends at the end of the sentence, the cluster from the "," preceding the relative pronoun to the end of the sentence forms a block whose goal is a clause and whose role is an adverb or an adjective. These rules correspond to the conditions under which groups, clauses and sentences are formed in ordinary modern English. In the figure, the symbol "" represents a space. In the morphological analysis section 9016, the English text received from the pre-editing section 9014 is first divided into sentences, which serve as the units of translation. Misspellings and unregistered words are detected at this stage. The dictionary 9018 is consulted for each sentence and the dictionary information for each constituent is retrieved. Various kinds of processing are then carried out according to this dictionary information.

Fig. 66 is a flowchart of the block grouping carried out in the morphological analysis section 9016. First, a position pointer indicating the reading position in the English sentence is set to the beginning (9100). Here "beginning" does not mean the first word but the (imaginary) start position immediately preceding the sentence. Word extraction 9101 is carried out from this position. As shown in Fig. 67, in operation 9101, if the pointer is not at the end of the sentence (9110), it is advanced by one position (9111) to take out a word, the dictionary 9018 is consulted for that word (9112), and the word information is written out (9113).

Once the word information has been extracted in operation 9101, the table 9036 of block start and end conditions is consulted to determine whether the word matches any start condition (9102). Steps 9101 and 9102 are repeated until a word matching a start condition is found.

When a start condition is matched, the next word and as many subsequent words as required are taken out and checked against the start condition of the block (9104). The dictionary is consulted for each of these words where necessary. The position pointer is not advanced.

If the start condition of the block is matched in step 9104, a word matching the block end condition associated with that start condition is searched for (9105). Steps 9104 to 9106 are repeated until a word matching the end condition is found. When a word matches the end condition (9106), the cluster including that word is recognised as a block and the block is written out (9107). More precisely, at the point where the end condition is first satisfied, it is determined whether the block preparation condition is satisfied, and the block is prepared. With reference to the block preparation information in table 9036, the position of the word indicated by the pointer where its advance was stopped in operation 9103 is set as the start position of the block, and the position of the first subsequent word satisfying the end condition is set as the end position of the block. The goal and role of the block are written at the same time.
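The scan of Fig. 66 can be sketched in Python as follows. This is a simplified illustration, not the patented procedure itself: the function name `find_blocks`, the rule dictionaries, the class labels and the example sentences are assumptions, dictionary lookup is replaced by pre-tagged input, and the sketch keeps scanning after each block so that several blocks can be reported.

```python
def find_blocks(tagged, rules):
    """tagged: list of (word, word_class); returns (start, end, goal, role) tuples."""
    # "END" is an imaginary marker for the end of the sentence.
    classes = [cls for _, cls in tagged] + ["END"]
    last = len(tagged) - 1
    blocks = []
    for pos in range(len(tagged)):              # advance the position pointer
        for rule in rules:
            n = len(rule["start"])
            # Look ahead n classes without moving the pointer (cf. step 9104).
            if tuple(classes[pos:pos + n]) != rule["start"]:
                continue
            # Search for the first word meeting an end condition (cf. 9105, 9106).
            for end in range(pos + n, len(classes)):
                if classes[end] in rule["ends"]:
                    # Write the block with its goal and role (cf. 9107).
                    blocks.append((pos, min(end, last), rule["goal"], rule["role"]))
                    break
    return blocks

# Hypothetical rules and pre-tagged example sentences.
RULES = [
    {"start": (",", "conjunction"), "ends": {"END"},
     "goal": "clause", "role": "clause"},
    {"start": (",", "relative_pronoun"), "ends": {",", "END"},
     "goal": "clause", "role": "adverb/adjective"},
]
s1 = [("He", "pronoun"), ("left", "verb"), (",", ","),
      ("but", "conjunction"), ("she", "pronoun"), ("stayed", "verb")]
s2 = [("The", "det"), ("man", "noun"), (",", ","), ("who", "relative_pronoun"),
      ("left", "verb"), (",", ","), ("smiled", "verb")]

print(find_blocks(s1, RULES))
print(find_blocks(s2, RULES))
```

The start position is the pointer position at which the start condition matched, and the end position is the first later word that satisfies an end condition, as in the description above.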

If, as a result of this block recognition, the English sentence contains ". . ., conjunction . . .", for example, as shown in Fig. 70, the cluster from the beginning of the sentence up to the part preceding "," is recognised as one block, while the cluster from ", conjunction" to the end of the sentence is recognised as another block. In the figure, the contents of [ ] indicate a block. In this block, both the goal and the role are sentences. Furthermore, the cluster from the word after the conjunction to the end of the sentence forms a further block in which both the goal and the role are sentences. Alternatively, the cluster from the conjunction to the end of the sentence may be defined as a block; in this case the goal is a clause and the role is an adverb.

A block may also be defined so that it starts from a position that does not include the ",". Punctuation and the like may also be excluded from grammatical analysis and treated as information attached to the block. In the same way, if ". . . relative pronoun . . ." is present, "relative pronoun . . ." can be recognised as a block. In this block the goal is a clause or a sentence and the role is an adverb or an adjective.

Blocks can of course also be nested. For example, if the English sentence has the structure "(beginning of sentence) . . ., conjunction, . . ., relative pronoun . . . (end of sentence)", as shown in Fig. 71, the cluster from ", conjunction" to the end of the sentence forms a block BL1-BL1, which contains ", relative pronoun . . .," as another block BL2-BL2.

In this way, the morphological analysis section 9016 identifies the features of the sentence with regard to form and vocabulary in order to recognize a structural unit as a block. In addition to such block recognition, section 9016 performs various kinds of processing, such as the handling of proper-name expressions, derived or unknown words, abbreviated words, numerical expressions, time expressions, hyphenated words and apostrophes ('), as well as apposition assessment and tag-question processing, in order to prepare the morphological analysis data.

The English sentence that has been morphologically analyzed in this way is transferred, together with the analyzed information, to the syntactic analysis section I 9020. Fig. 72 shows an example of the output data, namely the result obtained when an English sentence I "White House is'nt white" is entered from the input section 9010 and morphologically analyzed in section 9016. Block 1 begins at word position #4 and ends at position #10; in this case both the target and the role are optional. Likewise, block 2 begins at position #5 and ends at position #6; its target is a noun group and its role is a proper name. That is, the block "White House is'nt white." contains the other block "White House" nested within it. In the smaller block "White House", each of the components functions internally as part of a proper name, while relative to the outside, i.e. "is'nt white.", the block occupies the position of a word clause. "White House" can therefore be treated as a single unit.

Together with such block information, the word information retrieved from the word dictionary 9018 is added and passed from the morphological analysis section 9016 to the syntactic analysis section I 9020. Section I 9020 analyzes the surface structure of the English sentence by applying the context-free grammar rules stored in the rule file 9036 in order to find all possible structure trees. If a block is present, the partial grammatical analysis described above is carried out with reference to the local analysis. This improves both the efficiency and the accuracy of the analysis.

Specifically, the block inclusion relationships are prepared from the position information of the blocks. The innermost block is then analyzed. A block whose grammatical analysis has been completed is regarded as a single unit, and no further processing is carried out inside it. In this way, the scope of the grammatical analysis is gradually extended to the outer blocks until, finally, the whole sentence has been analyzed. Each analysis is carried out on the basis of the cfg rules, top-down and from right to left in the English sentence. The grammatical analysis is carried out in such a way that all possibilities permitted by the grammar rules are retained.
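The derivation of inclusion relationships from block positions, and the resulting innermost-first processing order, can be sketched as follows. This is a minimal illustration only: the `Block` type, the `contains` test and the depth-based ordering are assumptions made for the example and are not taken from the patent; the positions used are those given for Fig. 72.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    name: str
    start: int  # first word position of the block (inclusive)
    end: int    # last word position of the block (inclusive)


def contains(outer: Block, inner: Block) -> bool:
    """True if `inner` lies entirely inside `outer` and is a different span."""
    same_span = (outer.start, outer.end) == (inner.start, inner.end)
    return outer.start <= inner.start and inner.end <= outer.end and not same_span


def innermost_first(blocks: List[Block]) -> List[Block]:
    """Order blocks so that every block comes before any block that contains it."""
    # Nesting depth = number of other blocks that enclose this one.
    depth = {b: sum(contains(o, b) for o in blocks) for b in blocks}
    return sorted(blocks, key=lambda b: -depth[b])


# Positions as stated for Fig. 72: block 1 spans #4..#10, block 2 spans #5..#6.
blocks = [Block("block 1", 4, 10), Block("block 2", 5, 6)]
order = [b.name for b in innermost_first(blocks)]
```

With properly nested blocks, an inner block always has a greater depth than any block enclosing it, so sorting by descending depth yields an order in which the scope of analysis widens from the inside outward.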

Fig. 73 shows an example of such an analysis processing flow. First, on the basis of the English data supplied to section I 9020, all structural units of a sentence are recognized as blocks, and their targets and roles are assessed (9120). The manner of grouping is as shown in Fig. 70. If no block is present in such an arrangement (9121), the sentence is analyzed (9125), only a solution in which the sentence is collected into a single symbol is selected, and the analysis of the sentence is ended (9126). If a processing system is used that treats the whole sentence as a block, processing steps 9125 and 9126 are subsumed in processing steps 9121 to 9124 and are accordingly not required.

If a block is present, the innermost block is analyzed first (9122). In the example shown in Fig. 71, the inside of block BL2-BL2 is analyzed. Although grammatical analysis generally yields several solutions, the solution selected from among them is the one in which the block is collected into a single cfg symbol that matches the target of the block (9123). For a block whose target is optional, every solution that is collected into a single symbol is selected. Everything selected in this way is then treated as a single unit having the role of the block (9124). For a block with an optional role, the role of the symbol selected in step 9123 is defined as the role. Processing steps 9121 to 9124 are then repeated in sequence.
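The selection step 9123 and the role assignment of step 9124 might be sketched as below. The `Solution` tuple, the use of `None` to stand for an optional target or role, and the symbol names are hypothetical choices for this illustration; the patent does not specify a data layout.

```python
from typing import List, NamedTuple, Optional, Tuple


class Solution(NamedTuple):
    symbols: Tuple[str, ...]  # top-level cfg symbols covering the block
    role: str                 # role carried by the top symbol (assumed field)


def select_solutions(solutions: List[Solution],
                     target: Optional[str]) -> List[Solution]:
    """Keep solutions that reduce the block to one symbol matching the target.

    A target of None stands for an optional target: every solution that
    collapses to a single symbol is kept.
    """
    single = [s for s in solutions if len(s.symbols) == 1]
    if target is None:
        return single
    return [s for s in single if s.symbols[0] == target]


def block_role(block_role_hint: Optional[str], chosen: Solution) -> str:
    """Use the block's own role, or the chosen symbol's role if it is optional."""
    return block_role_hint if block_role_hint is not None else chosen.role


candidates = [
    Solution(("NP",), "noun"),
    Solution(("NP", "VP"), "sentence"),  # not collapsed to one symbol
    Solution(("S",), "sentence"),
]
```

Each surviving solution then stands in for the block as a single unit when the enclosing block is analyzed.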

In this way, in the example of Fig. 71, the inside of block BL2-BL2 is grammatically analyzed first, followed by the inside of block BL1-BL1. In the latter analysis, block BL2-BL2 is treated like a single word, and the components it contains are not analyzed grammatically.

Once the data defining the structural composition and the subordinate relationships have been established in this way, they are passed to the syntactic analysis section II 9022. The data can then easily be recognized in the form of a structure tree, as described above. The data are then converted into the structure of the Japanese sentence in the structure conversion section 9024 and the translation section 9026, and a translated sentence is generated for each of the nodes contained in it. Node processing in the structure tree is carried out top-down and from left to right.

The translated sentence generated in this way is then post-processed in the post-editing section 9030, displayed visually on the operation display section 9040 and, for example, printed out as a Japanese sentence 9034 in the output section 9032.

In this way, according to this embodiment, the features of the English sentence with regard to form and vocabulary are identified in order to recognize a structural unit as a block. For the block, a target, i.e. what the analysis result may be, and the structural role in which the block functions toward the outside are assessed. The surface structure of the English sentence is then analyzed by applying context-free grammar rules in order to find all possible structure trees. This reduces the number of wasteful solutions, improves the efficiency of the grammatical analysis and yields a more reliable analysis result.

There are thus various patterns for an appositional expression, and it is difficult to recognize them in grammatical analysis, especially in context-free analysis. Since, in view of the above, it is generally difficult to perform apposition recognition after the analysis, an ambiguous translation is unavoidable. Even if a rule were prepared with which such expressions could be recognized, there would be a risk that an expression that is not appositional is recognized as the same case, or the number of possible combinations would become considerable. That is, a costly local analysis would be carried out between the parts contained in the appositional expression and the other parts.

In view of this, in the present embodiment the processing load in the analysis step can be reduced by recognizing the appositional expression from the features of the sentence with regard to form or from the semantic features of the words. Apposition is assessed by recognizing the following patterns as a block.

For the English sentence structure "∼, relative pronoun ∼, ∼.", the relative pronoun is recognized by the fact that the part-of-speech code of the word carries a special code, for example "R". In this case, the inside enclosed by the two "," is regarded as a block, provided that it does not overlap with a block or unit indicated in pre-editing and that the part after the second "," does not contain "and" or "or". For the English sentence structure "∼, relative pronoun ∼.", the inside enclosed by "," and "." is regarded as a block. The period may be any other symbol used to mark the end of a sentence.
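The comma-delimited relative-pronoun pattern could be detected roughly as in the following sketch. It assumes tokens given as `(word, part_of_speech_code)` pairs with "R" marking a relative pronoun, as suggested in the text; all other codes, the function name and the set of sentence-final symbols are illustrative assumptions. The check against blocks or units marked during pre-editing is omitted for brevity.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (word, part-of-speech code); "R" = relative pronoun


def relative_pronoun_block(tokens: List[Token],
                           end_marks: Tuple[str, ...] = (".", "?", "!")
                           ) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) indices of the block, or None."""
    for i in range(len(tokens) - 1):
        # Look for "," immediately followed by a relative pronoun.
        if tokens[i][0] == "," and tokens[i + 1][1] == "R":
            start = i + 1
            for j in range(start, len(tokens)):
                word = tokens[j][0]
                if word == ",":
                    # Second comma: reject if "and"/"or" follows it.
                    tail = [w.lower() for w, _ in tokens[j + 1:]]
                    if "and" in tail or "or" in tail:
                        return None
                    return (start, j - 1)
                if word in end_marks:
                    # No second comma: the block runs to the sentence end.
                    return (start, j - 1)
    return None


embedded = [("Tom", "N"), (",", "P"), ("who", "R"), ("lives", "V"),
            ("here", "A"), (",", "P"), ("is", "V"), ("kind", "J"), (".", "P")]
final = [("I", "N"), ("met", "V"), ("Tom", "N"), (",", "P"),
         ("who", "R"), ("lives", "V"), ("here", "A"), (".", "P")]
```

Here the returned span excludes the delimiting punctuation, in line with the remark that a block may begin at a position that does not include the ",".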

To perform such apposition assessment, the dictionary 9018 is designed to store semantic information for words. The semantic information distinguishes article, place, person and so on, as shown in Fig. 74. For the block preparation condition, the table 9036 is likewise configured as shown in Fig. 75, so that the beginning of a block is recognized by "proper name (person), noun (person)" as the beginning condition, and the beginning of a block is also recognized by "proper name (person), article noun (person)". It is therefore possible to assess the appositional expression from its morphological and semantic features without performing a grammatical analysis, and to perform the grammatical analysis in accordance with the apposition assessment; the other processing operations are as in the example shown in Fig. 64.
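A pattern table of this kind can be matched against tagged words without any grammatical analysis, as the following sketch suggests. The token layout `(word, part_of_speech, semantic_class)`, the tag names and the reading of the "," in the quoted conditions as a literal comma token are all assumptions made for illustration; the actual contents of Fig. 74 and Fig. 75 are not reproduced here.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str, Optional[str]]      # (word, part of speech, semantic class)
Spec = Tuple[str, Optional[str]]            # (part of speech, required semantic class)

# Hypothetical encoding of the two conditions quoted in the text.
PATTERNS: List[List[Spec]] = [
    [("propn", "person"), (",", None), ("noun", "person")],
    [("propn", "person"), (",", None), ("article", None), ("noun", "person")],
]


def matches(token: Token, spec: Spec) -> bool:
    pos, sem = spec
    return token[1] == pos and (sem is None or token[2] == sem)


def find_appositions(tokens: List[Token]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) spans where a pattern matches."""
    hits = []
    for i in range(len(tokens)):
        for pattern in PATTERNS:
            window = tokens[i:i + len(pattern)]
            if len(window) == len(pattern) and all(
                    matches(t, s) for t, s in zip(window, pattern)):
                hits.append((i, i + len(pattern) - 1))
    return hits


sentence = [("Tom", "propn", "person"), (",", ",", None),
            ("the", "article", None), ("teacher", "noun", "person"),
            (",", ",", None), ("smiled", "verb", None)]
```

Because the semantic class is part of the condition, a proper name of a place followed by a person noun would not match, which is the kind of filtering the semantic information in the dictionary is meant to support.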

Incidentally, English sentences contain groups of words that carry highly specific information and are used only in limited ways. If they are analyzed in the same way as ordinary groups, they are analyzed as a quite different type of sentence, and it is difficult to recover the original character of the sentence through analysis. This also leads to considerable waste.

For example, "let's" or "let us" immediately after punctuation etc. is analyzed as an imperative sentence with the causative verb "let", whereas it should be analyzed as a group having the invitational sense of "let". "Let" is also used in various ways as a transitive verb and even as a noun meaning "rental"; it is not restricted to use in friendly invitations or requests, nor to use as an auxiliary verb. Consequently, the analysis has to be carried out for each of these possibilities, which lowers efficiency. Furthermore, it is difficult to derive the invitational use from the result of the analysis because, apart from sentence structure, there is no difference between the causative use and the invitational use, and it is difficult to distinguish them on the basis of sentence structure alone.

The waste incurred in the course of an analysis can be reduced by excluding "let's" or "let us" immediately after punctuation from the object of the analysis. By separating it from the basic use of the word, i.e. the causative use, a semantic analysis can readily be carried out.

If "please", "let's" or "let us" appears at the beginning of a block, a flag is set in the block information; in each of these cases it is not given in the information for the unit. For example, the English sentence "let's go to school." is processed as "go to school" to which "let's" is attached.

To carry out such "let" processing, in a modification of this embodiment a let-information processing section 9200 is arranged between the morphological analysis section 9016 and the syntactic analysis section I 9020. Fig. 77 shows the relevant sections. In this figure, the same elements as those shown in Fig. 64 are denoted by the same reference numerals.

Furthermore, the dictionary 9018 is designed to store let information for each word. As shown in Fig. 78, the let information is "0" for ordinary words, "1" for "let's" and "let us", and "2" for "please".

The let-information processing section 9200 receives the result of the morphological analysis together with the input English sentence from section 9016 and, during the analysis, adds the let information as additional information to the information for a word, as shown in Fig. 79. In this case, a block is set up for the sentence. In the example shown in the figure, this is block 0 (start: 1, end: 10, target: sentence, role: sentence). That is, the block in this example includes, in addition to a sentence, a clause, a group and so on. The notion of a block here also covers a paragraph and the whole sentence, each of which can be regarded as a block. Furthermore, "transitive verb root (with 's attached)" is written as the part of speech of "let's" in the word information, and its let information is "1".

As shown in Fig. 81, processing for preparing a block for the sentence (9300) is carried out on the input English sentence before block grouping is started. The subsequent processing operations are the same as in the flowcharts shown in Fig. 66. For example, for the English sentence I said, "let's go to school.", block 0 is formed (start: beginning of sentence, end: end of sentence, role: sentence, target: sentence).

As shown in Fig. 81, in the syntactic analysis section I 9020 each structural unit is recognized as a block on the basis of the English data supplied to it, and its target and role are assessed accordingly (9120). If no block is present in the arrangement (9121), the analysis is ended. If blocks are present in the input sentence, the innermost block is analyzed first (9122). Although the analysis generally yields several solutions, only the solution that is collected into a single cfg symbol is selected from among them (9123). The subsequent processing operations are the same as those in Fig. 23.

Such processing of "let" information is carried out in section 9200 in accordance with the processing flows shown in Figs. 83A and 83B. First, a pointer is set to the first block (9330) in order to check the word located at the beginning of the block (9331). If the let information is "0", the pointer is advanced by one step (9339) to move on to the next word.

If the let information is not "0", the preceding dictionary reference unit is checked (9322). If it is not punctuation and the pointer does not indicate the beginning, the pointer is advanced by one step (9339) to move on to the next word. If the check shows that the preceding dictionary reference unit is punctuation, or if the pointer indicates the beginning, the innermost block containing the word is marked (9333).

If the let information is then "1" (9334), the word is "let's" or "let us" immediately after punctuation, so the role of the marked block is recognized as "invitation sentence" (9336). If the information is "2", the word is "please", so the role of the marked block is recognized as "request sentence" (9335). The target of the marked block is then recognized as an imperative sentence (9337), and the word information indicated by the pointer is deleted (9338). The pointer is then advanced by one step (9339) to move on to the next word. This procedure is carried out up to the word at the end position (9340).
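The let-information flow of steps 9330 to 9340 can be approximated by a single pass over the dictionary reference units, as in the sketch below. The let-information values follow the description of Fig. 78; the `Block` type, the punctuation set, the treatment of "let us" as one unit, the reading of "beginning" as the first unit of the input and the returned set of deleted positions are simplifying assumptions for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Values as described for Fig. 78; any other word has let information 0.
LET_INFO = {"let's": 1, "let us": 1, "please": 2}
PUNCTUATION = {",", ".", ";", ":", "!", "?", '"'}  # assumed punctuation set


@dataclass
class Block:
    start: int                    # first unit position (inclusive)
    end: int                      # last unit position (inclusive)
    target: Optional[str] = None  # None stands for "optional"
    role: Optional[str] = None


def innermost_block(blocks: List[Block], pos: int) -> Optional[Block]:
    """Smallest block whose span covers position `pos`, if any."""
    covering = [b for b in blocks if b.start <= pos <= b.end]
    return min(covering, key=lambda b: b.end - b.start) if covering else None


def process_let(units: List[str], blocks: List[Block]) -> Set[int]:
    """Mark blocks introduced by let's / let us / please; return deleted positions."""
    deleted: Set[int] = set()
    for pos, unit in enumerate(units):                 # advance pointer (9339)
        info = LET_INFO.get(unit.lower(), 0)
        if info == 0:
            continue
        # Only the use at the beginning or right after punctuation counts.
        if pos != 0 and units[pos - 1] not in PUNCTUATION:
            continue
        block = innermost_block(blocks, pos)           # mark block (9333)
        if block is None:
            continue
        block.role = ("invitation sentence" if info == 1
                      else "request sentence")         # 9336 / 9335
        block.target = "imperative sentence"           # 9337
        deleted.add(pos)                               # delete word info (9338)
    return deleted


units = ["I", "said", ",", '"', "let's", "go", "to", "school", ".", '"']
blocks = [Block(0, 9, "sentence", "sentence"), Block(4, 8)]
deleted = process_let(units, blocks)
```

In this run the inner block covering "let's go to school." receives the role "invitation sentence" and the target "imperative sentence", while the outer block is left unchanged.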

Fig. 80 shows an example of the analysis result obtained when such let-information processing is applied to the aforementioned input sentence: I said, "let's go to school." When section 9200 adds the let information to the information for the word, it removes the information relating to the let information from the table, and the block information is written with "imperative sentence" as the target and "invitation sentence" as the role, as shown in Fig. 80.

When hyphenated words in an English sentence are handled by looking up the whole group of hyphen-connected words in dictionary 9018, the procedure succeeds if an entry for the group exists in dictionary 9018. If a hyphenated word that is not registered in dictionary 9018 is treated as a whole as an unknown word, for example as an adjective, it cannot be translated, because the dictionary information for the individual hyphen-connected words cannot be used. Moreover, where dictionary 9018 does contain entries for the individual constituents of the hyphenated word, that information should not be ignored.

Furthermore, if the analysis is carried out by breaking hyphenated words down into their constituents, the ways in which the constituents can combine into hyphenated words become extremely varied.

To solve this difficulty, the hyphenated word as a whole is analyzed as an adjective within the sentence, a separate analysis is carried out only for the inside of the hyphenated word using its constituents, and the results of these analyses are then combined. This makes it possible to analyze hyphenated words while still using the information for each constituent. That is, a hyphenated word that is not registered in dictionary 9018 is likewise treated as a whole as an adjective. The nouns connected by the hyphen are looked up in the dictionary, and the corresponding analysis is carried out in closed form for the inside of the hyphenated word only.

That is, if the hyphenated word is not registered in dictionary 9018, block information is output indicating that the whole is regarded as one block, and the dictionary is consulted for each constituent inside the block in order to output the information for the respective units, which do not include the hyphen. A constituent that is not registered in the reference dictionary undergoes final evaluation processing together with unknown-word processing, as one kind of processing for unknown words.

Such processing of hyphenated words can be carried out in the structure shown as an example in Fig. 64. In this case, the position of a word in the sentence is expressed not by a number assigned to the word but by the number of characters counted from the beginning of the sentence, i.e. by the character count.

Fig. 84 shows an example of the processing of hyphenated words carried out in the morphological analysis section 9016. For an input English sentence such as "The anti-war attitude is her open-door policy", the position pointer is advanced step by step to take out a word (9135), and a dictionary lookup is then performed (9353). The hyphen is not used as a word delimiter here. If an entry exists (9353), the word information is written out (9359).

This is repeated until the end of the sentence.

If the dictionary lookup 9352 finds no entry and the word contains no hyphen (9354), the word information is written out (9359); if the word does contain a hyphen, a hyphen block is written (9355). In the hyphen block, the start position is the start position of the hyphenated word and the end position is the end position of the hyphenated word. The target is optional, and the role is adjective/noun. The hyphen is then removed in order to take out each constituent word (9356), and each constituent word is looked up (9357). The word information obtained from this dictionary lookup is written out (9358). When word information is written out in steps 9359 and 9358, a word that is not registered in the dictionary is written out with part of speech = word not registered in the dictionary.
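The flow of steps 9352 to 9359 can be sketched as follows. This is only an illustration under stated assumptions: the function name, the tokenizing regular expression, the dictionary contents and the part-of-speech labels are hypothetical. Positions are 1-based character counts from the start of the sentence, as the text describes.

```python
import re


def analyse_hyphenated(sentence, dictionary):
    """Return (words, blocks); the hyphen is not used as a word delimiter."""
    words, blocks = [], []
    for m in re.finditer(r"[\w'-]+", sentence):
        token = m.group()
        pos = dictionary.get(token.lower())
        if pos is not None:
            # Entry found for the whole token, hyphenated or not.
            words.append((token, pos))
        elif "-" not in token:
            # No entry and no hyphen: write out as an unregistered word.
            words.append((token, "unregistered"))
        else:
            # No entry for the hyphenated whole: write a hyphen block,
            # then look up each constituent separately.
            blocks.append({"start": m.start() + 1, "end": m.end(),
                           "target": "optional", "role": "adjective/noun"})
            for part in token.split("-"):
                words.append((part, dictionary.get(part.lower(), "unregistered")))
    return words, blocks


# Hypothetical dictionary: "anti-war" registered, "open-door" not.
DICTIONARY = {"the": "article", "anti-war": "adjective", "attitude": "noun",
              "is": "verb", "her": "pronoun", "open": "adjective",
              "door": "noun", "policy": "noun"}

words, blocks = analyse_hyphenated(
    "The anti-war attitude is her open-door policy", DICTIONARY)
# blocks -> [{"start": 30, "end": 38, "target": "optional", "role": "adjective/noun"}]
```

With this assumed dictionary, "anti-war" is written out as a single word, while "open-door" yields one block spanning characters 30 to 38 plus the separate words "open" and "door".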

Fig. 85 shows examples of the block information and the word information for the English input that result from processing the example sentence and grouping it into blocks. In this example, the hyphenated word "anti-war" is registered in dictionary 9018 and the hyphenated word "open-door" is not. Consequently, the entry for the hyphenated word "anti-war" is written out as word information. The hyphenated word "open-door" is split into "open" and "door", which are written out as word information, and block 1 (start: 30, end: 38, target: optional, role: adjective/noun) is written out as block information.

Although the forms that an English tag question can take are extremely limited, processing tag questions with the usual analysis method is very complicated. Nor is it easy to determine the verb to which a tag question refers. Once a tag question has been recognized from the formal features of the sentence, it is treated as a tag question relative to the structural unit to which it belongs, so that the verb to which the tag question refers can be specified. That is, the tag-question part of the English sentence is detected as a structural pattern, and the analysis is carried out with that part regarded purely as information attached to a particular kind of structural unit.

In the present embodiment, a unit or block is marked by symbols (an indicator showing that this is the start point or end point of a unit or block). In the morphological analysis, the input text is put into shape, and block recognition is carried out at the same time. In the present embodiment, a quotation mark is denoted by "Q" and a bracket by "P". For example, the following are defined:

'. . . .' by (Q'. . . .)',
". . . ." by (Q". . . .)",
(. . . .) by ((P. . . .)),
<. . . .> by <(P. . . .)>,
{. . . .} by {(P. . . .)} and
[. . . .] by [(P. . . .)], respectively.

Block recognition is carried out in the same way.

The start and end symbols of a block are applied only in a context where the block is opened or closed by these symbols. The character immediately before the start symbol and the character immediately after the end symbol should be something other than an alphanumeric character. Symbols of the above kinds that do not satisfy this condition are treated as ordinary symbols. Blocks may be nested several times, provided that they do not cross or overlap.
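The context rule and the nesting rule can be sketched with a simple stack. This is an assumption-laden illustration rather than the patented procedure: the function name and the `PAIRS` table are hypothetical, only double quotation marks and the four bracket types are handled, and positions are 0-based string indices.

```python
# Opening symbol -> closing symbol (a subset of the symbols named in the text).
PAIRS = {"(": ")", "<": ">", "{": "}", "[": "]", '"': '"'}


def find_blocks(text):
    """Return (start, end, kind) tuples; kind is 'Q' (quotation) or 'P' (bracket)."""
    blocks, stack = [], []
    for i, ch in enumerate(text):
        before = text[i - 1] if i > 0 else " "
        after = text[i + 1] if i + 1 < len(text) else " "
        if stack and ch == PAIRS[stack[-1][1]] and not after.isalnum():
            # Closes the innermost open block, so blocks can nest but not cross.
            start, opener = stack.pop()
            blocks.append((start, i, "Q" if opener == '"' else "P"))
        elif ch in PAIRS and not before.isalnum():
            # Valid start symbol: not preceded by an alphanumeric character.
            stack.append((i, ch))
        # Any other occurrence is treated as an ordinary symbol.
    return blocks


print(find_blocks('I said, "go (now)."'))
# prints [(12, 16, 'P'), (8, 18, 'Q')]
```

Because a closing symbol is only accepted when it matches the innermost open block, an input such as `f(x)` yields no block: the opening parenthesis follows an alphanumeric character and is therefore treated as an ordinary symbol.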

In tag-question processing, if one of the following word groups follows at the moment the pointer indicates ",", the cluster from "," to "?" is deleted as a unit and a flag is set on the block. That is, the forms of a tag question include:

", (auxiliary verb) + (personal pronoun)?"
", (auxiliary verb) n't + (personal pronoun)?"
", (auxiliary verb) + (personal pronoun) + not?"

The auxiliary verbs include the following: am, is, are, was, were, do, does, did, have, has, had, will, shall, would, should, can, cannot, could, may, might, must, ought, won't, shan't, need, dare, used. The personal pronouns include I, you, he, she, it, we, they.

These are used as information on the innermost block layer to which they belong. For example, in the English sentence: You said so, didn't you?, the whole is recognized as one block in terms of structural composition: [You said so,] <with tag question>. Similarly, in the English sentence: I said, "You said so, didn't you?", the quoted sentence "You said so, didn't you?" is recognized as block 1 in terms of structural composition, and the whole is further recognized as block 2. That is, [I said, [You said so,] <with tag question>].

A contraction such as "didn't" is processed after it has been expanded into its fully written-out form according to a predetermined table. For a word that has several expanded forms, all of them are output.
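A table-driven expansion of this kind might look like the following sketch. The table entries and the function name are illustrative assumptions; the patent does not list the contents of its predetermined table.

```python
# Hypothetical expansion table; a contraction may have several expanded forms.
EXPANSIONS = {
    "didn't": ["did not"],
    "isn't": ["is not"],
    "let's": ["let us"],
    "he's": ["he is", "he has"],
}


def expand(word):
    """Return every expanded form of a contraction, or the word itself."""
    return EXPANSIONS.get(word.lower(), [word])


print(expand("didn't"))  # prints ['did not']
print(expand("He's"))    # prints ['he is', 'he has']
```

Returning a list even for unambiguous words keeps the caller uniform: later analysis stages can iterate over all candidate forms and discard those that do not fit.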

To carry out this processing of tag questions, a tag-question processing section 9210 is arranged between the morphological analysis section 9016 and the syntactic analysis section I 9020 in another modification of the invention. Fig. 87 shows the relevant sections. In these figures, elements identical to those in Fig. 64 are denoted by the same reference numerals.

Section 9210 receives the result of the morphological analysis, together with the input English sentence, from the morphological analysis section 9016, and a block is set up for the sentence as shown in Fig. 88. In the example shown in the figure, this is block 0 (start: 1, end: 12, target: sentence, role: role). In this modified embodiment, the position of a word is expressed by the word number. Also in this modified embodiment, a block may be, for example, a sentence in addition to a clause or a group. The notion of a block here also covers a paragraph and the whole sentence, each of which can be regarded as a block.

The grouping of the input English sentence, including the tag question, into blocks can be the same as in the flowchart described above and shown in Fig. 81. That is, processing 9300 for preparing the block of the sentence is carried out before the processing starts. For example, for the English sentence I said, "It is good, isn't it?", block 0 is formed (start: beginning of the sentence, end: end of the sentence, role: sentence, target: sentence). In the syntactic analysis section I 9020, the analysis is carried out using the same flowchart as shown in Fig. 82.

The processing in the tag-question processing section 9210 is explained with reference to Figs. 90A and 90B. First, a pointer is set to the first word of the word information (9370). If the word is not a comma, the pointer is advanced by one step (9384). This is repeated until the end of the sentence (9371). If the word is a comma, the pointer is left where it is and a check is made as to whether the word following the comma belongs to the α group or to the β group (9373, 9379). Here, a word whose part of speech is auxiliary verb or the verb "be" and which is not in negative form belongs to the α group, while a word whose part of speech includes the negative form of an auxiliary verb or the negative form of the verb "be" belongs to the β group. If the word belongs to neither group, the pointer is advanced by one step (9384) and the operations are repeated until the end of the sentence (9371).

If the word belongs to the α group and the word following it is not a pronoun, step 9384, in which the pointer is advanced, is carried out. If it is a pronoun, a check is made as to whether the next word is "not" (9375); if it is not "not", a check is made as to whether the word following the pronoun is a question mark (9377). If it is not a question mark, step 9384 is carried out. If it is a question mark, the target of the innermost block is rewritten as "negative sentence" and its role as "tag-question sentence" (9378), and ", ...?" is deleted from the word information table (9383). The innermost block is the block that satisfies both start position ≦ (position of ",") and end position ≧ (position of "?") and that has the minimum (end position - start position).
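The selection of the innermost block can be written as a short sketch. The function name and the dictionary layout of a block are hypothetical; the two containment conditions and the minimum-span criterion follow the text.

```python
def innermost_block(blocks, comma_pos, qmark_pos):
    """Pick the smallest block that contains both the "," and the "?"."""
    candidates = [b for b in blocks
                  if b["start"] <= comma_pos and b["end"] >= qmark_pos]
    # Smallest span (end - start) wins; None if no block qualifies.
    return min(candidates, key=lambda b: b["end"] - b["start"], default=None)


# Word-number positions for: I said, "It is good, isn't it?"
blocks = [{"id": 0, "start": 1, "end": 12},
          {"id": 1, "start": 4, "end": 12}]
print(innermost_block(blocks, comma_pos=8, qmark_pos=11)["id"])  # prints 1
```

In this example both blocks contain the tag question, and block 1 is chosen because its span (8) is smaller than that of block 0 (11).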

If the word following the pronoun is "not" at step 9375, a check is made as to whether the word following "not" is a question mark (9376). If it is not a question mark, step 9384 is carried out, i.e. the pointer is advanced. If it is a question mark, the target of the innermost block is written as "affirmative sentence" and the role as "tag-question sentence" (9382), and ", ...?" is deleted from the word information table (9383).

If at step 9379 the word following the comma belongs to the β group, step 9384 is carried out if the word following the β-group word is not a pronoun. If it is a pronoun, a check is made as to whether the word following it is a question mark (9381); if it is not, the process proceeds to step 9384. If it is a question mark, the target of the innermost block is written as "affirmative sentence" and the role as "tag-question sentence" (9382), and ", ...?" is deleted from the word information table (9383). The pointer is then advanced by one step (9384), and the operations are repeated until the end of the sentence.
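The α/β decision flow of Figs. 90A and 90B can be approximated by the sketch below. It is a simplification under stated assumptions: the function and set names are hypothetical, group membership is decided from the surface form of the word rather than from dictionary part-of-speech information, and any word ending in "n't" (plus "cannot") is assumed to belong to the β group. Word numbers are 1-based.

```python
# Non-negative auxiliaries (α group), taken from the list in the description.
ALPHA = {"am", "is", "are", "was", "were", "do", "does", "did", "have", "has",
         "had", "will", "shall", "would", "should", "can", "could", "may",
         "might", "must", "ought", "need", "dare", "used"}
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}


def is_beta(word):
    """Assumption: negative auxiliaries end in "n't" or are "cannot"."""
    return word.endswith("n't") or word == "cannot"


def find_tag_question(words):
    """Return (first, last, target) for the ", ... ?" cluster, or None."""
    w = [x.lower() for x in words]
    for i, word in enumerate(w):
        if word != ",":
            continue  # advance the pointer
        rest = w[i + 1:i + 5]
        if len(rest) >= 3 and rest[0] in ALPHA and rest[1] in PRONOUNS:
            if rest[2] == "?":
                return (i + 1, i + 4, "negative sentence")
            if rest[2] == "not" and len(rest) >= 4 and rest[3] == "?":
                return (i + 1, i + 5, "affirmative sentence")
        elif (len(rest) >= 3 and is_beta(rest[0])
              and rest[1] in PRONOUNS and rest[2] == "?"):
            return (i + 1, i + 4, "affirmative sentence")
    return None


words = ["I", "said", ",", '"', "It", "is", "good", ",", "isn't", "it", "?", '"']
print(find_tag_question(words))  # prints (8, 11, 'affirmative sentence')
```

The returned word range is the cluster that would be deleted from the word information table, and the returned target is what would be written into the innermost block together with the role "tag-question sentence".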

For example, Fig. 88 shows the block and word information that the tag-question processing section 9210 receives from the morphological analysis section 9016 for the English sentence: I said, "It is good, isn't it?" The block information for block 1 is (start: 4, end: 12, target: optional, role: optional). After the tag-question processing in section 9210, the block information for block 1 is rewritten as (start: 4, end: 12, target: affirmative sentence, role: tag-question sentence), and at the same time the word information for the tag question, #8 ... #11, is deleted.

To Fig. 1
1010 input section
1018 word dictionary
1036 Analysis rules file
1100 input device
1102 document file
1104 input interface
1106 processing section
1108 Dictionary fetch section
1110 Contrast elimination processing section
1112 control section
1116 Analysis input file
1118 Analysis dictionary information file
1016 Morphological analysis section
To Fig. 2
1010 input section
1012 English text
1014 Pre-Editing Section
1016 Morphological analysis section
1018 word dictionary
1020 Analysis Section I
1022 Analysis section II
1024 structure conversion section
1026 translation section
1030 Reviewing Section
1032 output section
1034 Japanese sentence translated
1036 Analysis rules file
1038 control section
1040 operation display section
To Fig. 4
1200 Enter process
1201 split
1202 end position?
1203 retrieve
1204 Eliminate opposition
1205 edition
To Fig. 5
1210 input
1211 reshaping
To FIGS. 8A to 8D
1221 P → highest flag equal to 1?
1223 PSAVE → end position greater than or equal to P → start position?
1225 PSAVE → language part equal to verb?
1226 P → language part equal to verb?
1227 P → language part equal to verb?
1231 P → effective input 0
1232 PSAVE → effective input 0
1235 end position?
To Fig. 11
2010 input device
2012 Template file entered
2014 input section
2014 a buffer for entered string
2016 processing section
2016 a dictionary information preservation table
2018 output interface
2020 dictionary retrieval section
2022 dictionary file
2024 unit detection section
2026 basic unit dictionary file
2028 control section
To Fig. 15
2100 Read sentences
2102 Split
2104 Is there a string?
2106 retrieve
2108 Is there an entrance?
2110 Preserve dictionary information
2112 unit detection
2114 edition
To Fig. 16
2200 Place pointer P at the top of the string
2202 Query basic unit dictionary
2204 Is there an entrance?
2206 Advance to P.
2208 Does the string exist?
2210 Store unit information
2212 Preserve failed information
To Fig. 17
3014 pre-editing section
3100 input processing section
3102 unit splitting section
3104 Dictionary fetch section
3108 Delimitation table
3018 word dictionary
3110, 3112, 3114, 3116, 3120, 3122, 3020 processing section
3124 Dictionary information preservation table
To Fig. 18
3010 input section
3012 English text
3014 pre-editing section
3016 Morphological analysis section
3018 dictionary
3022 analysis section II
3024 structure conversion section
3026 translation section
3030 revision section
3032 output part
3034 Japanese sentence translated
3036 Analysis rules file
3038 control section
3040 operation display section
To FIGS. 19A and 19B
32 Enter process
3201 Split
3202 end position
3203 Search
3209 Collective compilation
3210 output
3204 Is there an entrance?
3205 Numeric flag equal to 1?
3206 numbering
3207 Collective compilation
3208 conservation
3212 with hyphen?
3213 Process hyphenated number
3214 currency symbol?
3216 preservation
3217 Delete currency symbol
3215 Process sequence of numeric characters
To Fig. 20
3220 initialization process
3221 numerical value?
3222 Is the previous one a currency symbol?
3223 Put together into a single noun
3224 Is the previous one?
3225 Put together into a single noun
3226 advance pointer
3227 end position
To FIGS. 21A and 21B
3230 Initialization process
3231 Split
3232 end position?
3233 Make the stored value a numerical value
3235 Get
3236 Is there an input?
3237 preservation
3238 Numeric flag equal to 1?
3239 numbering
3240 addition
3241 preservation
3207 Collective compilation
To FIGS. 22A and 22B
3250 initialization
3251 Is *P numeric?
3252 Is *P a notation?
3253 Is *P a decimal point?
3254 i = i × 10
3255 preservation
3256 i = 1?
3257 val.save = val.save × 10 + number(*P)
3258 val.save = val.save + number(*P) / i
3259 Advance pointer P
3260 end position?
3261 Set the numerical value to val.save
3207 Collective compilation
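
The arithmetic in steps 3254 to 3261 can be sketched as follows. This is only an illustration under assumptions: the order in which the flowchart multiplies i by 10 is not recoverable from the legend, so here i is set to 10 at the decimal point and multiplied again after each fractional digit. The function name `parse_numeral` is hypothetical, and the "notation" test of step 3252 is left out.

```python
def parse_numeral(text):
    """Convert a string of digits with an optional decimal point to a number."""
    val_save = 0.0   # running value, named after val.save in steps 3257/3258
    i = 1            # 1 while reading the integer part, then 10, 100, ...
    for ch in text:  # the loop variable plays the role of *P
        if ch == ".":
            if i != 1:
                raise ValueError("second decimal point in " + repr(text))
            i = 10   # assumed: the fraction starts at tenths
        elif ch in "0123456789":
            if i == 1:
                # integer part: shift left one decimal place and add the digit
                val_save = val_save * 10 + int(ch)
            else:
                # fractional part: add the digit scaled by the current place
                val_save = val_save + int(ch) / i
                i = i * 10
        else:
            raise ValueError("unexpected character " + repr(ch))
    return val_save
```

For example, `parse_numeral("12.5")` accumulates 1, then 12, then 12 + 5/10.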
To FIGS. 23A and 23B
3270 Set pointer to previous position
3271 Is there no word at the previous position?
3272 input is "and"
3273 Does the pointer display a numerical value?
3276 end position?
3277 Does the pointer show a numerical value?
3278 Round off v-now as vl
3281 Clear "and" information
3282 P → End position End position of the memory reference unit
3284 preservation
To Fig. 30
4012 input device
4014 word processing section
4014 a buffer for entered string
4016 unit splitting section
4018 limitation table
4020 reference dictionary
4022 reference fetch section
4026 Section for processing the previous sentence end
4028 proper name processing section
4030 Section for processing a proper name in itself
4032 processing section
4034 Section for obtaining missing feature information
4036 processing section
4036 a Dictionary information preservation table
4038 analysis section
To Fig. 32
4100 Input process
4102 Split
4104 Are strings ended?
4106 Get
4108 Is there an entry?
4110 proper name?
4112 preservation
4114 Processing for registered proper names
4116 capital letter?
4118 Record the proper name as an unregistered word
4120 Processing for an unregistered proper name
4122 Feature addition does not comply
4124 output
To Fig. 33
4200 Is the previous one an unregistered proper name?
4202 Record whole part with feature information itself
4204 Is the previous one a registered proper name?
4206 Is feature information of the previous one unknown?
4208 feature information per se is unknown
4210 Record the whole part with the feature information from the previous one
4212 Is there a common characteristic?
4214 Record the proper name with the common characteristic
4216 Record the proper name with the characteristic information itself
4218 Record
To Fig. 34
4300 Is the previous one a candidate for the end?
4302 Assess whether the previous is an end
4303 Convert large letters to small letters
4306 Is there an entry?
4308 conservation
4310 Preservation as an unregistered proper name
4312 Judge whether the previous is not an end
4314 Assess whether it is a proper name in itself
4316 Processing the proper name
To Fig. 35
4400 Place pointer at the top
4402 proper names?
4404 Feature information is unknown
4406 addition of the incorrect feature information
4408 advance pointer
4410 end?
To Fig. 37
5014 pre-editing section
51 Input processing section
5102 unit splitting section
5104 Dictionary fetch section
5108 Delimitation table
5018 reference dictionary
5110 position information processing section
5112 Section for judging the previous sentence end
5114, 5116, 5118 proper name processing section
5124 Dictionary Information Preservation Table
5020 Morphological Analysis Section I
To Fig. 38
5010 input section
5012 English text
5014 pre-editing section
5016 Morphological analysis section
5018 word dictionary
5020 Syntactic analysis section
5022 Syntactic analysis section II
5024 structure conversion section
5026 translation section
5030 revision section
5032 output section
5034 Japanese sentence translated
5038 control section
5040 operation display section
To Fig. 40
5200 Input process
5201 Split
5202 end position?
5203 Get
5204 Is there an input?
5205 proper name?
5206 preservation
5207 Processing of registered proper names
5210 output
5212 capital letter?
5213 Processing of unregistered names
5214 Record the proper name as an unregistered name
To Fig. 41
5220 position information
5221 pattern 0
5222 pattern 1
5223 pattern 2 or 3
5230 Is the following an unregistered proper name?
5231 POS equals 1?
5232 POS equals 0?
5233 Record the whole part as POS equal to 1
5234 Record separately as POS equal to 0
5235 Record whole part as POS equal to 0
To Fig. 43
5240 Is the previous one an unregistered proper name?
5241 Convert the previous word to an unregistered one
5242 Record separately as POS equal to 1
5250 Is the previous one an unregistered proper name?
5251 POS equal to 1?
5252 POS equals 0?
5253 POS equal self
5254 "The"?
5255 As POS equals POS. . . Record self with previous word
5256 Record "3" from "the" itself as a POS
5257 Separately as POS equals POS. . . record self
To Fig. 45
5260 Is the preceding one a sentence end candidate?
5261 Judge whether the candidate is a sentence end
5262 Convert upper case letters to lower case letters
5263 Is there an entry?
5264, 5265 Record
5221 Processing pattern 0
To Fig. 47
6012 input device
6014 input processing section
6014 a buffer for entered string
6016 unit splitting section
6018 delimitation table
6020 reference dictionary
6022 reference fetch section
6026 Section for processing the previous sentence end
6028 Section for processing the previous proper name
6030 Section for processing the proper name
6036 processing section
6036 a Dictionary information preservation table
6038 Syntactic analysis section
To Fig. 49
6100 Input process
6102 Split
6104 end?
6106 retrieve
6108 Is there an entry?
6110 proper name?
6112 Record
6114 Processing the registered proper name
6116 Capital letter?
6118 Record the proper name as an unregistered word
6120 Processing the unregistered proper name
6122 output
To Fig. 50
6200 Is the previous one an unregistered proper name?
6202 Record the whole part as a proper name with characteristic information
6204 proper name?
6206 Is feature information unknown?
6208 Is feature information itself unknown?
6210 Record the whole part as a proper name with the characteristic information of the previous one
6212 Record the proper name separately
6214 Record
To Fig. 51
6300 Is the previous one a sentence end candidate?
6302 Judge whether the previous one is an end
6304 Convert upper case to lower case
6306 Is there an entry?
6308 Record
6310 Record the proper name unchanged
6312 Judge whether the previous one is not an end
6314 Assess whether the noun itself is a proper name with unknown feature information
6316 Processing the registered proper name
To Fig. 53
7014 pre-editing section
7100 input processing section
7102 unit splitting section
7104 Delimitation table
7018 dictionary
7106 Dictionary fetch section
7108 Morpheme processing
7110 initial value setting section
7112 Adjustment Retrieval Section
7114 unit splitting section
7116 polling section
7118 Section for generating morpheme processing information
7120 polling section
7122, 7124 processing section
7126 Dictionary information preservation table
7020 Syntactic analysis section I
To Fig. 54
7010 input section
7012 English text
7014 pre-editing section
7016 Morphological analysis section
7018 dictionary
7020 Syntactic analysis section
7022 Syntactic analysis section II
7040 structure forming section
7026 translation section
7030 revision section
7032 output section
7034 Japanese sentence translated
7036 Analysis rules file
7038 control section
7040 operation display section
To Fig. 55a and 55b
7300 input
7302 Split
7304 Finished?
7306 Get
7308 Addition of morpheme processing information
7310 Provide information with morpheme processing?
7312 record
7314 collective processing
7410 initial setting
7412 Adjust by (p + n) th unit
7414 adapted?
7416 Increase n by 1
7418 input buffer end?
7420 retrieval by (p + n)
7422 addition of morpheme processing information
7424 n <1
7426 adjustment
7428 adapted?
7430 accumulation of P on (P + n -1)
7434 Record P separately
To Fig. 58
8020 input processing section
8021 unit splitting section
8023 limitation table
8024 reference table
8025 Dictionary information preservation table
8026 dictionary retrieving section
To Fig. 59
8030 adjustment section
8031 prefix dictionary
8024 reference dictionary
8032 Dictionary fetch section
8033 adjustment section
8034 Dictionary information preparation section
8035 Derivative processing section
8025 Dictionary information preservation table
To Fig. 60
8040 adjustment section
8041 Suffix dictionary
8042 processing section
8043 Dictionary fetch section
8044 adjustment section
8045 Section for processing dictionary information preparation
8046 dictionary retrieving section
8047 word processing section
8025 Dictionary information preservation table
To Fig. 61
8020 input processing section
8021 unit splitting section
8022 demarcation table
8024 reference dictionary
8025 Dictionary information preservation table
8030 adjustment section
8031 prefix dictionary
8032 Dictionary fetch section
8033 adjustment section
8034 Dictionary information preparation section
8040 adjustment section
8041 Suffix dictionary
8042 processing section
8043 Dictionary fetch section
8044 adjustment section
8045 Section for processing dictionary information preparation
8040 adjustment section
8047 word processing section
To Fig. 62
8020 input processing section
8021 unit splitting section
8022 demarcation table
8024 reference dictionary
8025 Dictionary information preservation table
8030, 8033 adjustment section
8031 prefix dictionary
8032 Dictionary fetch section
8034 Dictionary information preparation section
8040 adjustment section
8041 Suffix dictionary
8042, 8046 dictionary retrieving section
8044 adjustment section
8045 Section for processing dictionary information preparation
8047 word processing section
8050 Processing section for rated information
8051 preparation
To Fig. 63
8001 input section
8002 English text
8003 pre-editing section
8004 Morphological analysis section
8005 Syntactic analysis section I
8006 Syntactic analysis section II
8007 operation display section
8008 word dictionary
8009 Analysis rule file
8010 control section
8011 structure conversion section
8012 Section for creating a translated sentence
8013 revision section
8014 output section
8015 Japanese sentence
To Fig. 64
9010 input section
9012 English text
9014 pre-editing section
9016 Morphological analysis section
9018 word dictionary
9020 analysis section I
9022 analysis section II
9024 structure conversion section
9026 Section for creating a translation
9030 revision section
9032 output section
9034 Translated Japanese sentence
9036 Analysis rules file
9038 control section
9040 operation display section
To Fig. 65
9014 pre-editing section
9016 Morphological analysis section
9018 word dictionary
9020 analysis section I
9036 Analysis rule file
To Fig. 66
9100 Put position at the top
9101 Take out word
9102 Adjusted with upper condition?
9103 Take out the necessary number of words
9104 Adapted to the above condition P ?
9105 Search
9106 Does the end condition exist?
9107 Register block
To Fig. 67
9110 Has sentence ended?
9111 Advance position by 1
9112 retrieve
9113 Write word information
To Fig. 73
9120 Assessment of goal and role
9121 Is there a block?
9122 Analyze the innermost block
9123 Select
9124 Treat what has been selected as a single block
9125 Analyze sentence
9126 Select
To Fig. 76
9010 input section
9012 English text
9014 pre-editing section
9016 Morphological analysis section
9018 word dictionary
9020 analysis section I
9022 analysis section II
9024 structure conversion section
9026 translation section
9030 revision section
9032 output section
9034 Translated Japanese sentence
9036 Analysis rules file
9038 control section
9040 operation display section
9200 processing section for "let" information
To Fig. 77
9014 pre-editing section
9016 Morphological analysis section
9020 analysis section I
9018 word dictionary
9036 Analysis rules file
9200 processing section for "let" information
To Fig. 81
9300 Prepare a block
9100 Put position at the top
9101 Process for taking out a word
9102 Adapted to the above condition?
9103 Take out the necessary number of words
9104 Adapted to the above condition?
9105 Search
9106 Does the end condition exist?
9107 Register block
To Fig. 82
9120 Assessment of goal and role
9121 Is there a block?
9122 Analyze innermost block
9123 Select
9124 Treat what is selected as a single block
To FIG. 83A and 83B
9330 Place pointer on top of word information
9331 Is "let" information 0?
9332 Is the language part a punctuation mark?
9333 Mark innermost block
9334 "let" information
9335 Recognize the role as a requirement record
9336 Recognize the role as an invitation phrase
9337 Recognize the target as a command sentence
9338 Delete word information
9339 Advance pointer by 1
9330 end?
To Fig. 84
9350 end of sentence?
9351 Advance position by 1
9352 Get
9353 Is there an entry?
9354 With hyphen?
9355 Registration of the block
9356 Remove the hyphen
9357 Get
9358 Writing out word information
9359 Writing out word information
To Fig. 86
9010 input section
9012 English text
9014 pre-editing section
9016 Morphological analysis section
9018 word dictionary
9020 analysis section I
9022 analysis section II
9024 structure conversion section
9026 translation section
9030 revision section
9032 output section
9034 Translated Japanese sentence
9036 Analysis rules file
9038 control section
9040 operation display section
9210 Processing section for additional question
To Fig. 87
9014 pre-editing section
9016 Morphological analysis section
9018 word dictionary
9020 analysis section I
9036 Analysis rules file
9210 Processing section for additional question
To FIG. 90A and 90B
9370 Place pointer at the top
9371 end?
9372 comma?
9373 group
9374 Proper name?
9375 "not"
9376 Question marked?
9377 Question marked?
9378 Write the goal for negative sentence and the role for additional question
9376 β group?
9380 proper name?
9381 Question marked?
9382 Write the destination for the confirmation sentence and the role for the additional question
9383 Delete "question mark"
9384 Advance pointer by 1

Claims

1. Speech analyzer, characterized by a dictionary device with dictionary data stored therein, including morpheme data for words, compound words and phrases, and
an analysis device for performing a morphological analysis on an input sentence with reference to the dictionary device, wherein
the dictionary device contains degree-of-coupling data, which indicate the degree of coupling between the respective words that form the compound words or phrases, and
the analysis device refers to the dictionary device for the respective words contained in the input sentence and, when a number of dictionary data are retrieved for one word in combination with other words, selects the combination of words having the higher degree of coupling by referring to the degree-of-coupling data.
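
The selection described in claim 1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Entry` type, the integer coupling scale, the sample dictionary content and the tie-break on length (claim 2 only speaks of "a certain rule").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    words: tuple     # the words forming the word, compound word or phrase
    coupling: int    # hypothetical scale: larger means more tightly coupled


# Hypothetical dictionary content, keyed by the first word of each entry.
DICTIONARY = {
    "in": [
        Entry(("in",), 0),
        Entry(("in", "front"), 1),
        Entry(("in", "front", "of"), 3),
    ],
}


def select_entry(tokens, start):
    """Return the entry matching at `start` with the highest degree of coupling."""
    candidates = [
        e for e in DICTIONARY.get(tokens[start], [])
        if tuple(tokens[start:start + len(e.words)]) == e.words
    ]
    if not candidates:
        return None
    # Assumed tie-break: among equal coupling degrees, prefer the longer match.
    return max(candidates, key=lambda e: (e.coupling, len(e.words)))
```

With the sentence "in front of the house", all three entries match at position 0 and the phrase "in front of" is chosen because its coupling degree is highest.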

2. Speech analyzer according to claim 1, characterized in that, if the degree-of-coupling data indicate a high degree of coupling for a number of combinations between one word retrieved in the dictionary device and other words, the analysis device gives preference to one of the number of combinations according to preference data prepared to follow a certain rule.

3. Speech analyzer, characterized by an input device for entering a character string of a predetermined language; a basic dictionary device which is searched for the input string and has basic data stored therein, and
an analysis device for grammatical analysis of the input character string by searching the basic dictionary device, wherein
the analysis device searches the basic dictionary device for the input string and, when part of the string is thereby retrieved, searches the basic dictionary device for the other parts of the string in the same manner, thereby morphologically analyzing the string.

4. Speech analyzer according to claim 3, characterized in that the basic dictionary device is a basic unit dictionary device in which data expressing units of measure are stored, and that the analysis device searches the basic unit dictionary device for the input character string in order to morphologically analyze whether or not the string expresses a unit of measure.

5. Speech analyzer according to claim 4, characterized in that the analysis device searches the basic unit dictionary device for the character string and, if the retrieval shows that the string consists solely of a combination of strings stored in the basic unit dictionary device, judges the string to be one that expresses a unit.

6. Speech analyzer according to one of claims 3 to 5, characterized in that the analysis device has a pointer, that the pointer is placed on the character at the top of the input character string, that, if part of the string starting from the character on which the pointer is set is retrieved when the basic dictionary device is searched for the string, the pointer is set to the string that follows the retrieved part, and that the basic dictionary device is then searched for the subsequent string to which the pointer is set.
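
The pointer-driven retrieval of claims 4 to 6 can be sketched as below. The dictionary content and the function name `expresses_unit` are hypothetical, and taking the longest matching piece at each pointer position is an assumption; the claims do not say which match is preferred when several are possible.

```python
# Hypothetical basic unit dictionary: pieces from which unit strings are built.
BASIC_UNITS = {"k", "c", "m", "g", "s", "/"}


def expresses_unit(string, basic_units=BASIC_UNITS):
    """Return True if `string` consists solely of dictionary pieces."""
    if not string:
        return False
    p = 0  # pointer P, placed at the top of the string
    while p < len(string):
        # Look for a dictionary piece that starts at P (longest first).
        for end in range(len(string), p, -1):
            if string[p:end] in basic_units:
                p = end  # advance P past the retrieved part
                break
        else:
            return False  # nothing in the dictionary starts at P
    return True
```

Here "km/s" is accepted as the sequence k, m, /, s, while "kx" fails at the second character. A greedy scan like this one does not backtrack, so a fuller implementation might need to try shorter pieces when a longer one leads to a dead end.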

7. Speech analyzer according to one of claims 3 to 6, characterized in that the input character string is one that has been looked up in a common dictionary device and is not stored in that dictionary device.

8. Speech analyzer, characterized by a dictionary device with dictionary data stored therein for respective dictionary reference units, and an analysis device for subdividing an input sentence into dictionary reference units and for carrying out a morphological analysis of the dictionary reference units with reference to the dictionary device, wherein
the dictionary device contains, as dictionary data for dictionary reference units that represent numbers, a distinguishing indication that the dictionary reference unit represents a number, and
the analysis device refers to the dictionary device for the respective dictionary reference units contained in the input sentence, and,
when the distinguishing indication is contained in the retrieved dictionary data, the numerical values meant by the dictionary reference unit for which the distinguishing indication was retrieved and by an adjacent dictionary reference unit for which the distinguishing indication was likewise retrieved are calculated together to form a single numerical value, and the two dictionary reference units are formed into a single analysis unit.

9. Speech analyzer according to claim 8, characterized in that the analysis device, when the analysis unit is accompanied by a dictionary reference unit that expresses a currency symbol or a unit of measure, combines that dictionary reference unit together with the numerical value into one analysis unit.
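
As a rough illustration of claim 8, the sketch below merges adjacent number words into one analysis unit with a single numerical value. The word list, the function names and the arithmetic rule for English number words are assumptions; the claim only states that the values are "calculated together". The currency and unit handling of claim 9 is not shown.

```python
# Hypothetical dictionary data: presence in this table stands for the numeric flag.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "twenty": 20,
                "hundred": 100, "thousand": 1000}


def value_of(words):
    """Compute one value from a run of number words (assumed English rule)."""
    total = current = 0
    for w in words:
        n = NUMBER_WORDS[w]
        if n == 100:
            current = max(current, 1) * 100
        elif n >= 1000:
            total += max(current, 1) * n
            current = 0
        else:
            current += n
    return total + current


def combine_numbers(units):
    """Merge runs of adjacent number units into single (text, value) units."""
    result, run = [], []
    for u in list(units) + [None]:       # None is a sentinel that flushes the last run
        if u is not None and u in NUMBER_WORDS:
            run.append(u)
            continue
        if run:
            result.append((" ".join(run), value_of(run)))
            run = []
        if u is not None:
            result.append((u, None))     # non-numeric units carry no value
    return result
```

For the units "two", "hundred", "twenty", "three", "dollars", the first four are merged into one unit with the value 223 and "dollars" is left as a separate unit.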

10. Speech analyzer, characterized by an input device for entering a character string of a predetermined language;
a dictionary device which is searched for the character string input from the input device;
a retrieving device for searching the dictionary device for the input string and
a feature information creating device for generating, as a result of the retrieval by the retrieving device, feature information for those of the input character strings that are not recorded in the dictionary device and for those whose feature information is not recorded in the dictionary device, wherein the feature information creating device generates a number of items of feature information for a string that has no feature information.

11. A speech analyzer according to claim 10, characterized in that the analyzer further comprises a unit subdivision device for splitting the input character string into units for searching the dictionary device, and an analysis device for analyzing a string split off by the unit subdivision device together with the preceding string after the string has been looked up by the retrieving device.

12. Speech analyzer according to claim 11, characterized in that, if feature information is present in either the string or the preceding string and no feature information is present in the other, the analysis device provides the string that has no feature information with the feature information of the other string.

13. A speech analyzer according to claim 11, characterized in that, if feature information is present both in the string and in the preceding string, the analysis device gives the feature information common to both as the feature information for this string and for the preceding string.
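
The two rules of claims 12 and 13 can be sketched together. Representing feature information as a Python set, using `None` for "unknown", and the function name `merge_features` are assumptions made for this illustration. The fallback to the string's own features when nothing is shared follows steps 4212 to 4216 of the Fig. 33 legend.

```python
def merge_features(prev, curr):
    """Combine the feature information of a preceding string and a string.

    `prev` and `curr` are sets of features, or None when unknown.
    """
    if prev is None and curr is None:
        return None            # both unknown: nothing to hand over
    if prev is None:
        return set(curr)       # claim 12: take the features of the other string
    if curr is None:
        return set(prev)
    common = prev & curr       # claim 13: features shared by both strings
    # No common feature: keep the string's own features (Fig. 33, step 4216).
    return common if common else set(curr)
```

For instance, a preceding name that may be a person or a place, followed by a name that may be a place or an organization, yields the single feature "place".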

14. Speech analyzer according to one of claims 2 to 4, characterized in that the string and the preceding string that have been morphologically analyzed by the analysis device are proper names.

15. A speech analyzer according to claim 11, characterized in that, if the string starts with an upper-case letter and follows a preceding end of sentence, the analysis device converts the upper-case letter at the beginning of the string to a lower-case letter and then searches the dictionary device by means of the retrieving device, and, if the string is found not to be registered in the dictionary device as a result of the retrieval, analyzes the string as an unregistered proper name.
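
The handling of a capitalized word after a sentence end in claim 15 can be sketched as follows. The dictionary content, the return labels and the function name `classify` are hypothetical; treating any other capitalized, unregistered word as an unregistered proper name follows steps 4116 and 4118 of the Fig. 32 legend.

```python
# Hypothetical word dictionary (lower-case entries only).
DICTIONARY = {"the", "dog", "runs"}


def classify(word, after_sentence_end):
    """Classify one word as registered, unregistered proper name or unknown."""
    if word in DICTIONARY:
        return "registered"
    if word[:1].isupper():
        if after_sentence_end:
            # The capital may only mark the sentence start: retry in lower case.
            lowered = word[0].lower() + word[1:]
            if lowered in DICTIONARY:
                return "registered"
        return "unregistered proper name"
    return "unknown"
```

So "The" at the start of a sentence is found as "the", whereas "Ricoh" is not found in either form and is recorded as an unregistered proper name.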

16. Speech analyzer, characterized by a dictionary device which has dictionary data stored therein for the respective dictionary reference units, and
an analysis device for dividing an input sentence into dictionary reference units and for performing a morphological analysis of the dictionary reference units with reference to the dictionary device, wherein
the dictionary device contains, as dictionary data for a dictionary reference unit that denotes a proper name, discrimination information which determines the position that unit may occupy in a sequence of consecutive proper names, and the analysis device refers to the dictionary device for the respective dictionary reference units contained in the input sentence and, when the discrimination information is contained in the retrieved dictionary data, combines the dictionary reference unit for which the discrimination information was retrieved with an adjacent dictionary reference unit that denotes another proper name into a single analysis unit according to the position determined by the discrimination information.

17. Speech analyzer, characterized by an input device for entering a character string of a predetermined language;
a dictionary device which is searched for the character string input to the input device, and a feature information analyzing device which searches the dictionary device for the input character string and then analyzes the feature information of the string, the feature information analyzing device analyzing the feature information of the string while taking into account the previously analyzed feature information of the string preceding it.

18. A speech analyzer according to claim 17, characterized in that the feature information analyzing device collectively analyzes the feature information of a number of strings by arranging and assembling the number of strings.

19. A speech analyzer according to claim 17, characterized in that, if the feature information is present in either this string or the string preceding it and is not present in the other, the feature information analyzing device supplies the string that has no feature information with the feature information of the other string.

20. Speech analyzer according to one of claims 17 to 19, characterized in that the string that has been analyzed by the feature information analyzing device is a proper name.

21. Speech analyzer, characterized by a dictionary device with dictionary data stored therein for each of the dictionary reference units, and
an analysis device for dividing an input sentence into dictionary reference units and for performing a morphological analysis of the dictionary reference units with reference to the dictionary device, the analysis device distinguishing that a sequence of dictionary reference units with a particular semantic element is a composite unit that expresses a specific meaning and is formed according to a certain rule, and forming the sequence of dictionary reference units with the specific semantic element into a single analysis unit.

22. A speech analyzer according to claim 21, characterized in that
the dictionary device contains data for distinguishing a dictionary reference unit that has a specific semantic element;
the analysis device has a corresponding distinction table for distinguishing that a sequence of dictionary reference units with the specific semantic element is a composite unit which expresses a specific meaning and is composed according to a specific rule, and
the analysis device searches the dictionary device for the respective dictionary reference units contained in the input sentence and, if a dictionary reference unit has been distinguished as having the specific semantic element, forms the sequence of dictionary reference units with the specific semantic element into a single analysis unit in accordance with the distinction table.

23. Speech analyzer, characterized in that a grammatical feature, a semantic feature or a translated word for a word not registered in a dictionary is judged by recognizing the word as a derivative on the basis of a morphemic feature such as an affix.
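
One way to picture the affix-based judgment of claim 23 is the sketch below, which derives only a part of speech. The affix tables, the word list and the function name `analyze` are hypothetical, and the assumption that a prefix leaves the part of speech unchanged is made purely for illustration.

```python
# Hypothetical suffix dictionary: suffix -> (required base class, derived class).
SUFFIXES = {"ness": ("adjective", "noun"), "ly": ("adjective", "adverb")}
# Hypothetical prefix dictionary; a prefix is assumed not to change the class.
PREFIXES = ("un", "re")
# Hypothetical word dictionary: registered word -> part of speech.
WORDS = {"kind": "adjective", "quick": "adjective"}


def analyze(word):
    """Return the part of speech of `word`, deriving it from affixes if needed."""
    if word in WORDS:
        return WORDS[word]
    for suffix, (base_pos, derived_pos) in SUFFIXES.items():
        if word.endswith(suffix):
            base = word[:-len(suffix)]
            if WORDS.get(base) == base_pos:
                return derived_pos
    for prefix in PREFIXES:
        if word.startswith(prefix):
            pos = analyze(word[len(prefix):])  # retry without the prefix
            if pos is not None:
                return pos
    return None  # not registered and not recognizable as a derivative
```

The unregistered word "unkindness" is resolved by stripping "un" and then recognizing "kindness" as "kind" plus the noun-forming suffix "ness".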

24. Speech analyzer, characterized by a first analysis device for performing a morphological analysis of an input sentence of a predetermined language;
a second analysis device for performing an analysis of the sentence of the language based on the result of the morphological analysis of the first analysis device; a dictionary device with dictionary data of the language stored therein, which is used for the analysis by the first and the second analysis devices, and
a control device for searching the dictionary device and thus carrying out the corresponding analyses of the first and second analysis devices, wherein
the first analysis device queries the dictionary device, distinguishes a structural compilation by judging the features of the input sentence of the language in terms of its form, and evaluates the position of this compilation, which results from the analysis of its structure, and the role which this compilation performs in the sentence, and
the second analysis device analyzes the surface structure of the sentence of the language by applying a grammar rule on the basis of the evaluated position and role and by analyzing the possible interrelations of the components contained in the sentence.

25. A speech analyzer according to claim 24, characterized in that the second analysis device analyzes the compilation in preference to other parts when a compilation is contained in the sentence of the language.

26. Speech analyzer according to claim 24, characterized in that the first analysis device distinguishes the features of the sentence of the language in terms of its form and its meaning and evaluates an appositive expression on the basis of the distinguished features.

27. Speech analyzer according to claim 24, characterized in that the predetermined language is English and
the dictionary device contains, as dictionary data, discrimination information distinguishing predetermined words containing "let's" and "let us", and
the first analysis device, when it obtains the discrimination information by searching the dictionary device, excludes "let's" from the subject of the analysis in the second analysis device, and likewise excludes "let us" if punctuation is present in the preceding part.

28. A speech analyzer according to claim 27, characterized in that the first analysis device evaluates the position as a command and the role as an invitation for the compilation that contains the excluded part.
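
The treatment of "let's" and "let us" in claims 27 and 28 might look roughly like this. The token representation, the punctuation set, the labels and the function name `mark_let` are hypothetical, and counting the start of the sentence as equivalent to preceding punctuation is an assumption not stated in the claims.

```python
PUNCTUATION = {",", ";", ":", ".", "!", "?"}


def mark_let(tokens):
    """Remove "let's" / "let us" and return (remaining tokens, annotation)."""
    lowered = [t.lower() for t in tokens]
    for i, tok in enumerate(lowered):
        # Assumed: the sentence start counts like preceding punctuation.
        preceded = i == 0 or lowered[i - 1] in PUNCTUATION
        if tok == "let's":
            span = 1
        elif tok == "let" and lowered[i + 1:i + 2] == ["us"] and preceded:
            span = 2
        else:
            continue
        # Exclude the words from the input handed to the second analysis
        # and record position and role for the containing block instead.
        remaining = tokens[:i] + tokens[i + span:]
        return remaining, {"position": "command", "role": "invitation"}
    return tokens, None
```

In "They let us in" the words "let us" are not preceded by punctuation, so nothing is excluded and the ordinary analysis of the verb "let" applies.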

29. A speech analyzer according to claim 24, characterized in that the first analysis device, if no dictionary data are obtained when the dictionary device is searched for a number of hyphenated words as a whole, searches the dictionary device for each of the words, evaluates the whole of the number of words as one compilation and evaluates the position of the compilation as an adjective group.
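
The hyphen handling of claim 29 can be sketched as below. The dictionary content, the result layout and the function name `lookup_hyphenated` are hypothetical illustrations rather than the structure used in the patent.

```python
# Hypothetical word dictionary; "well-known" is registered as a whole.
DICTIONARY = {"state", "of", "the", "art", "well-known"}


def lookup_hyphenated(token):
    """Look up a token; fall back to its hyphen-separated parts."""
    if token in DICTIONARY:
        return {"block": False, "parts": [token]}
    if "-" in token:
        # Remove the hyphens and look up each word separately.
        parts = [p for p in token.split("-") if p]
        return {
            "block": True,                  # the whole is treated as one block
            "position": "adjective group",  # position assigned by claim 29
            "parts": [(p, p in DICTIONARY) for p in parts],
        }
    return None  # plain unregistered word, handled elsewhere
```

Thus "state-of-the-art" is not found as a whole, each of its four words is looked up, and the group is registered as a single block in adjective position.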

30. Speech analyzer according to claim 24, characterized in that the predetermined language is English, and
the first analysis device distinguishes the features of the English sentence in terms of its form, distinguishes the supplementary question group on the basis of the distinguished features, evaluates the whole of the supplementary question group as a compilation whose position is a supplementary question group, and excludes it from the subject of the analysis in the second analysis device.