DE3733674C2

Info

Publication number: DE3733674C2
Application number: DE3733674A
Authority: DE
Inventors: Toshihiko Yokogawa (Yokohama, Kanagawa, JP)
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1986-10-03
Filing date: 1987-10-05
Publication date: 1992-02-06
Also published as: DE3733674A1; FR2604814B1; NL8702359A; FR2604814A1

Description

The invention relates to a language analyzer. It is suitable for analyzing natural language and can be used in particular in conjunction with an automatic translation device.

A language analyzer with a dictionary device containing morphological data of a particular language is already known from WICKERT, Klaus, "Automatische Spracheingabe und Sprachausgabe", Haar bei München, Verlag Mark & Technik, 1983, pages 372-390. In one embodiment of this known language analyzer, a first analysis device performs a morphological analysis of an input character string (word sequence) in a particular language. This first analysis device is followed by a second analysis device, which performs a syntactic analysis of the input character string (word sequence) on the basis of the analysis result of the first analysis device. In such a language analyzer, very large amounts of data must be transferred and passed on in complex data structures, so that, among other things, unwieldy large word trees arise and, in addition, wrong decisions can occur in the lower processing stages which cannot be corrected afterwards.

A full synthesis system for the German language is known from PAULUS, E.; WOLF, D.; NTG-Fachberichte 94, Sprachkommunikation; VDE-Verlag GmbH, Berlin, Offenbach; July 1986; pages 209-214 (ISBN 3-8007-1465-5), in which a transcription program can be realized by incorporating an automatic morphological analysis. This known system is based on a comprehensive morph lexicon and a set of rules describing the structure of German word forms. The analysis result obtained is intended, on the one hand, to simplify transcription in the narrower sense, since a direct, rule-based grapheme-to-phoneme conversion can be performed once the morph boundaries are known; on the other hand, the analysis result forms the basis for a simple syntactic and prosodic analysis. In addition, word stress, speech rhythm and sentence melody can be derived, so that the quality of the synthetic speech can be decidedly improved.

The object underlying the invention is to provide a language analyzer in which the amounts of data associated with the word trees (possible word sequences with different probabilities for each branch) can be substantially reduced, so that the data processing can be carried out more simply or on a smaller scale.

This object is achieved according to the invention by the features set out in claim 1.

Particularly advantageous embodiments and developments of the invention emerge from the dependent claims.

The invention is explained in more detail below by way of exemplary embodiments with reference to the drawings, in which:

Figs. 1 to 10 show a first embodiment of a language analyzer having features of the invention, applied to an English-Japanese automatic translation device, in which:

Fig. 1 is a functional block diagram of an example of the detailed structure of a morphological analysis section;

Fig. 2 is a functional diagram of the entire structure;

Fig. 3 is an explanatory diagram showing an example of the structure of a dictionary file provided with a highest-preference flag;

Fig. 4 is a flow chart of an example of a morphological analysis;

Fig. 5 is a flow chart of an example of the input processing in the morphological analysis;

Fig. 6 is an explanatory diagram of an example of forming an input character string;

Fig. 7 is an explanatory diagram showing an example of a dictionary lookup;

Figs. 8A to 8D are flow charts showing an example of how a contradiction in the highest-preference flag is eliminated in the morphological analysis;

Fig. 9 is an explanatory diagram of an example of the contents of a buffer for retrieved dictionary information after the dictionary has been consulted;

Fig. 10 is an explanatory diagram of an example of the contents of the buffer for retrieved dictionary information as a result of eliminating a contradiction in the highest-preference flag;

Figs. 11 to 16 show a second embodiment having features of the invention, in which:

Fig. 11 is a block diagram of the embodiment;

Fig. 12 is a table showing an example of data stored in a dictionary file;

Fig. 13 is a table showing an example of data stored in a dictionary file serving as a basic unit;

Fig. 14 is a table showing an example of data stored in a dictionary information preservation table;

Fig. 15 is a flow chart of the operation of this device;

Fig. 16 is a flow chart of the recognition unit;

Figs. 17 to 29 show a third embodiment of a language analyzer having features of the invention, applied to an English-Japanese automatic translation device, in which:

Fig. 17 is a functional block diagram of an example of the detailed structure of a morphological analysis;

Fig. 18 is a functional block diagram of the entire structure;

Figs. 19A and 19B are flow charts of an example of a morphological analysis;

Fig. 20 is a flow chart showing an example of a collective arrangement for a currency symbol and for the unit in the morphological analysis;

Figs. 21A and 21B are flow charts of an example of handling hyphenated numerals in the morphological analysis;

Figs. 22A and 22B are flow charts of an example of processing consecutive numbers in the morphological analysis;

Figs. 23A and 23B are flow charts of an example of a collective arrangement with a preceding numerical value in the morphological analysis;

Fig. 24 is an explanatory diagram showing an example of the structure of a dictionary file provided with a numerical flag;

Fig. 25 is an explanatory diagram of an example of an input character string;

Figs. 26A to 26D are explanatory diagrams showing the contents of the dictionary information preservation table retrieved from the dictionary for the input character string shown in Fig. 25, according to the processing steps;

Fig. 27 is an explanatory diagram of a further example of an input character string;

Fig. 28 is an explanatory diagram showing the contents of a currency symbol table, a positional notation table and a decimal point table in the dictionary;

Figs. 29A to 29D are explanatory diagrams showing an example of the dictionary information preservation table retrieved from the dictionary for the input character string shown in Fig. 27, according to the processing steps;

Figs. 30 to 36 show a fourth embodiment of a language analyzer having features of the invention, in which:

Fig. 30 is a block diagram of this embodiment;

Fig. 31 is a table showing an example of data stored in another dictionary;

Fig. 32 is a flow chart of the operation of the entire device;

Fig. 33 is a flow chart of the processing for proper names registered in a dictionary;

Fig. 34 is a flow chart of the processing for a proper name not registered in the dictionary;

Fig. 35 is a flow chart of the processing for deficient feature information;

Fig. 36 is a table showing an example in which data stored in the dictionary information preservation table is changed after processing of the input sentence;

Figs. 37 to 46 show a fifth embodiment of a language analyzer having features of the invention, applied to an English-Japanese automatic translation device, in which:

Fig. 37 is a functional block diagram of an example of the detailed structure of the morphological analysis section;

Fig. 38 is a functional block diagram of the entire structure;

Fig. 39 is an explanatory diagram of an embodiment of the structure of a dictionary file;

Fig. 40 is a flow chart of an example of a morphological analysis for a proper name;

Fig. 41 is a flow chart of an example of the collective arrangement for proper names registered in a dictionary for the morphological analysis;

Figs. 42 to 44 are flow charts of an example of processing that depends on the position information in the morphological analysis of a proper name;

Fig. 45 is a flow chart of an example of the collective arrangement, in the morphological analysis, for proper names not registered in a dictionary;

Figs. 46A to 46F are explanatory diagrams showing the contents of the dictionary information preservation table looked up in the dictionary for an example input character string, according to the processing steps;

Figs. 47 to 52 show a sixth embodiment of a language analyzer having features of the invention, in which:

Fig. 47 is a block diagram of this embodiment;

Fig. 48 is a table showing an example of data stored in a reference dictionary;

Fig. 49 is a flow chart of the operation of the entire device;

Fig. 50 is a flow chart of the processing for proper names stored in the dictionary;

Fig. 51 is a flow chart of the processing for proper names not registered in the dictionary;

Fig. 52 is a table showing an example in which data stored in a dictionary information preservation table is changed after processing of the input sentence;

Figs. 53 to 57 show a seventh embodiment of a language analyzer having features of the invention, applied in conjunction with an English-Japanese automatic translation device, in which:

Fig. 53 is a functional block diagram of an embodiment of the detailed structure of the morphological analysis section;

Fig. 54 is a functional block diagram of the entire structure;

Figs. 55A and 55B are flow charts of an example of the morphological analysis;

Fig. 56 is an explanatory diagram showing an example of the contents of the information table in an information processing section;

Fig. 57 is an explanatory diagram of an example of the contents of an adaptation table (7128);

Figs. 58 to 63 show an eighth embodiment of a language analyzer having features of the invention, in which:

Fig. 58 is a block diagram explaining the entire structure;

Fig. 59 is a block diagram explaining an example of the processing of words derived by means of a prefix;

Fig. 60 is a block diagram explaining an example of the processing of words derived by means of a suffix;

Fig. 61 is a block diagram of the overall details obtained by combining Figs. 58 to 60;

Fig. 62 is a block diagram of further details of the complete unregistered-word processing section in Fig. 61;

Fig. 63 is a block diagram explaining an embodiment of an automatic translation device in which the invention is applied;

Figs. 64 to 90 show a ninth embodiment of a language analyzer having features of the invention, applied in an English-Japanese translation device, in which:

Fig. 64 is a functional block diagram of the entire structure;

Fig. 65 is a functional block diagram summarizing, as one block, the function of recognizing the structural arrangement of an input English sentence;

Fig. 66 is a flow chart of an example of a flow for the collective arrangement of a block with respect to the input sentence;

Fig. 67 is a flow chart of details of the word processing in that processing flow;

Fig. 68 is an explanatory diagram of an example of the dictionary information for English words or phrases stored in a word dictionary;

Fig. 69 is an explanatory diagram of an example of the table data for the block start state and end state and for a purpose and role evaluation state, stored in an analysis file;

Fig. 70 is an explanatory diagram of an example of a collective arrangement for a structure;

Fig. 71 is an explanatory diagram of an example of a collective arrangement for a block;

Fig. 72 is an explanatory diagram of an example of English information and word information collectively arranged in a block;

Fig. 73 is a flow chart of an example of analysis processing performed in a corresponding analysis section;

Fig. 74 is an explanatory diagram similar to Fig. 68, showing an example of access to a word-phrase dictionary in the case where this embodiment has an identical-case evaluation function;

Fig. 75 is an explanatory diagram similar to Fig. 69, showing an example of the start and end states of a block and a table of block preparation information in the case where this embodiment has an identical-case evaluation function;

Fig. 76 is a functional block diagram similar to Fig. 64, showing the overall structure of a modification of this embodiment;

Fig. 77 is a functional block diagram similar to Fig. 76, summarizing the function of grammatically analyzing (let) information in the modified embodiment shown in Fig. 76;

Fig. 78 is an explanatory diagram of an example of dictionary information containing the (let) information for English words and phrases stored in the word memory in the modified embodiment;

Figs. 79 and 80 are explanatory diagrams similar to Fig. 72, showing an example of the block and word information when an English sentence containing (let) information is collectively arranged in a block;

Fig. 81 is a flow chart similar to Fig. 73, showing an example of a flow for the collective arrangement of the (let) information with respect to the input English sentence;

Fig. 82 is a flow chart similar to Fig. 73, showing an example of the analysis processing, including (let) information, performed in the analysis section of the modified embodiment;

Figs. 83A and 83B are flow charts showing an example of the flow for grammatically analyzing the (let) information for the input English sentence;

Fig. 84 is a flow chart of an example of the flow of the grammatical analysis of hyphenated words for the input English sentence;

Fig. 85 is an explanatory diagram similar to Fig. 72, showing an example of the block and word information collectively arranged in a block for an input English sentence containing a hyphenated word;

Fig. 86 is a functional block diagram similar to Fig. 64, showing the entire structure of another modified embodiment;

Fig. 87 is a functional block diagram similar to Fig. 65, in which the function of morphologically analyzing the tag question in the input English sentence is collectively arranged for the modified embodiment shown in Fig. 86;

Figs. 88 and 89 are explanatory diagrams similar to Fig. 72, showing an example of the block and word information collectively arranged in a block for an English sentence containing a tag question; and

Figs. 90A and 90B are flow charts of an example of an analysis sequence for a tag question in the input English sentence.

The first embodiment having features of the invention will now be described. Fig. 1 shows the overall structure of the first embodiment, in which a language analyzer is applied in an English-Japanese automatic translation device. The invention can of course be applied just as effectively not only to an automatic translation device for translating English into Japanese, but also to any language analyzer in which the sentences of an input language are analyzed, chiefly in order to translate one language into another.

The embodiment in Fig. 1 has an input section 1010 through which an English text 1012 to be translated into Japanese is entered. The input section 1010 may comprise, for example, a keyboard with character keys such as alphanumeric keys and function keys, an optical character reader (OCR) for reading English text recorded on paper, and/or a file store for reading English text recorded on a storage medium such as a magnetic disk. The English text entered at the input section 1010 is read into a pre-editing section 1014, in which preprocessing for the translation is carried out. This section mainly performs sentence recognition and the processing of unknown words, and thus acts as part of the morphological analysis.

The pre-edited English data are transmitted, together with the information obtained during pre-editing, to a first, or morphological, analysis section 1016. The section 1016 segments the sentence with reference to a word dictionary 1018, analyzes the morphemes of the English sentence, performs various kinds of arrangement such as processing for unknown words, expressions of time, expressions of numbers and so on, and performs processing that concerns the sentence as a whole, such as supplementary questions and the recognition of identical cases. The rules for the morphological analysis are stored in a rule file 1036.

After morpheme analysis, the English data are transmitted, together with the dictionary information obtained by the morphological analysis, to a second analysis section, the parsing section I 1020. (In the following, parsing means a grammatical analysis, i.e. an automatic syntax analysis.) The parsing section I 1020 is a functional section that analyzes the surface structure of an English sentence by applying grammar rules to the English data, and in doing so finds all structural possibilities.

After analysis in the parsing section I 1020, the English data are passed, together with the resulting analysis information, to a further analysis section, the parsing section II 1022. In this section a solution is selected by applying a structure description to the result of the surface-structure analysis performed by the parsing section I. An acceptable parse tree for the English sentence is prepared in this way to represent its structure. These parsing rules are stored in the parsing rule file 1036.

After the analysis, the English data are transmitted as parse-tree data to a structure transformation section 1034. In the section 1034, a corresponding Japanese sentence tree is prepared from the structure tree, i.e. an intermediate structure of the English sentence, and is converted into an underlying Japanese structure from which Japanese can then easily be generated.

The structure tree data showing the underlying Japanese structure obtained by this structure transformation are delivered to a translation section 1026, in which the translated sentence is formed. This is a functional section that generates a Japanese sentence from the Japanese structure tree.

The Japanese sentence data obtained in this way, i.e. the translated sentence data, are then delivered to a post-editing section 1030. The section 1030 modifies the translated sentence data with reference to the dictionary 10128, using information employed during translation, in order to produce a more natural Japanese sentence. The Japanese sentence data are then transmitted to an output section 1032 and output from it as the translated Japanese sentence 1034. The output section 1032 comprises, for example, a printer, a display and/or a file storage device such as a magnetic disk.

The flow of these successive translation operations is controlled by a control section 1038, which controls the entire device. The word dictionary 1018 stores dictionary data for the words of the English and Japanese languages. In this embodiment it holds not only the vocabulary but also various kinds of information such as co-occurrence relations, meanings, plural and singular forms, parts of speech and so on. The file 1036 additionally stores the rule data for the morphological and syntactic analysis.

The control section 1038 is connected to an operation and display section 1040. The section 1040 has operating keys with which an operator gives various commands to the device, such as translation command keys and cursor keys, and a display that visually presents the entered English text, the Japanese sentence resulting from the translation, intermediate data such as dictionary information, various instructions to the operator and so on. Most of the operation and display functions are designed to be contained in a keyboard, if one is provided at the input section 1010, or in a display, if one is provided at the output section 1032.

Fig. 1 shows the detailed structure of the morphological analysis section 1016. The section 1016 has an input unit 1100, namely a keyboard of the input section 1010, and an input interface 1104 that forms an interface with the file 1102 for input documents. The input interface 1104 is provided with an input character string buffer, which receives English character string data in the form of code data, for example ASCII, from the input unit 1100 or from the file 1102, and temporarily stores the character string data. The input character string may be one that has been pre-edited in the section 1014.

As shown in Fig. 1, the first, or morphological, analysis section 1016 comprises a processing section 1106, a dictionary lookup section 1108, a contradiction-elimination processing section 1110 and a control section 1112. The processing section 1106 is a parsing function section that performs the morphological analysis and has a buffer for retrieved dictionary information, i.e. a dictionary information retention table 1120 (see Fig. 9). The morphological analysis is carried out as follows: dictionary lookup is commanded in order from the head of the input character string according to the lookup key character string; the dictionary information obtained accordingly from the dictionary lookup section 1108 is stored in the buffer 1120 for retrieved dictionary information; and priority processing according to the highest preference flag is performed, as described later.

The dictionary lookup section 1108 is a functional section that retrieves dictionary information by looking up the word dictionary 1018 on the basis of the lookup key character string specified by the processing section 1106, and then transmits this information to the processing section 1106.

The word dictionary 1018 stores, for each word entry, grammatical information such as the part of speech and the inflection, as well as a highest preference flag, as shown in Fig. 3 for example entries. The dictionary is provided as a dictionary file with a highest preference flag. The "highest preference flag" is a flag that indicates the degree of coupling between the words contained in a compound word or phrase forming a dictionary entry: "0" indicates weak coupling or no coupling, while "1" indicates strong coupling. A compound word or phrase judged to have strong coupling is treated as being used as a phrase; otherwise, the possibility of use as individual words is also considered in parallel.

As illustrated in Fig. 3, the word dictionary 1018 may contain separate entries for a compound word, for a phrase and for the individual words that make them up, and no distinction is made between individual words and compound words or phrases. Furthermore, each inflected form constitutes its own entry. When a word has several inflected forms, each is registered as a separate entry, and the kind of inflection is indicated in the inflection field. The situation is similar for the part of speech: registration under several parts of speech is permitted, and part-of-speech information is included for each of them. As further information, the countability or uncountability of a noun, whether a verb is transitive or intransitive, the translated word and so on are registered.

For example, "get" is the infinitive form of a verb, and its highest preference flag is "0". The phrase "get up" is a phrase in infinitive form, and its highest preference flag is "1". Furthermore, the preposition group "up to" has the highest preference flag "1", whereas a word group such as the compound word "white house" has the highest preference flag "0", which shows that the degree of coupling between its words is low. In Fig. 3, a dedicated symbol marks a space character.
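The dictionary entries described above can be pictured as simple records. The following Python sketch is only an illustration under stated assumptions: the class name `DictEntry`, its field names and the helper `strongly_coupled` are hypothetical, and the parts of speech given for "up to" and "white house" are assumed for the example. Only the four entry strings, the infinitive forms and the flag values come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictEntry:
    """One entry of the word dictionary (hypothetical layout)."""
    entry: str            # entry string; words separated by single spaces
    part_of_speech: str   # assumed label, one per registered entry
    inflection: str       # kind of inflection, empty if not applicable
    top_flag: int         # highest preference flag: 1 = strong coupling, 0 = weak or none


# Example entries taken from the description; labels other than the flags
# and the infinitive forms are assumptions made for illustration.
WORD_DICTIONARY = [
    DictEntry("get", "verb", "infinitive", 0),
    DictEntry("get up", "verb", "infinitive", 1),
    DictEntry("up to", "preposition", "", 1),
    DictEntry("white house", "noun", "", 0),
]


def strongly_coupled(entries):
    """Return the entry strings whose highest preference flag is 1."""
    return [e.entry for e in entries if e.top_flag == 1]


print(strongly_coupled(WORD_DICTIONARY))
```

Keeping single words and multi-word entries in one flat list mirrors the statement that no distinction is made between individual words and compound words or phrases.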

The dictionary information retrieved by the lookup section 1108 thus contains the highest preference flag. If the highest preference flag is set to "1" for identical or overlapping character strings, this contradiction must be eliminated. The section 1110 performs the contradiction elimination and the subsequent processing with reference to the contradiction-elimination rule for the highest preference flag, which is stored in the file 1036.

In the present embodiment, the contradiction-elimination rule is applied in the following order (1) to (3) to make a preferential selection:

(1) a phrase or word whose part of speech is a verb;
(2) a compound word, phrase or word with many word components;
(3) a compound word, phrase or word located toward the front of the sentence.

Whether the word selected in this way, i.e. the parsing unit, is used is recorded as active information in the buffer 1120 for retrieved dictionary information in the processing section 1106. Active information of "1" indicates that the parsing unit is valid, while "0" indicates that this possibility is not used.
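The preference order (1) to (3) and the active information can be sketched as follows. This is a hedged illustration, not the flowcharts of Figs. 8A to 8D: the names `Candidate`, `rank` and `eliminate_contradictions` are hypothetical, and the sketch assumes that a flagged candidate which loses against an overlapping flagged candidate, as well as any unflagged candidate lying wholly inside a winning flagged candidate, has its active information set to 0.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A retrieved dictionary entry with its position in the input buffer."""
    text: str
    start: int       # index of the first character
    end: int         # index of the last character (inclusive)
    pos: str         # part of speech
    flag: int        # highest preference flag
    active: int = 1  # active information, initially 1


def overlaps(a, b):
    """True if the two candidates share at least one buffer position."""
    return a.start <= b.end and b.start <= a.end


def rank(c):
    """Smaller tuple = preferred: verb first, more words next, earlier position last."""
    return (0 if c.pos == "verb" else 1, -len(c.text.split()), c.start)


def eliminate_contradictions(cands):
    """Set active to 0 for candidates that lose under rules (1) to (3)."""
    winners = []
    for c in sorted((c for c in cands if c.flag == 1), key=rank):
        if any(overlaps(c, w) for w in winners):
            c.active = 0          # a preferred flagged candidate already covers this span
        else:
            winners.append(c)
    for c in cands:
        # Assumption: single words inside a strongly coupled winner are discarded.
        if c.flag == 0 and any(w.start <= c.start and c.end <= w.end for w in winners):
            c.active = 0
    return cands


# Candidates for "... get up to ..." with assumed positions and parts of speech.
cands = [
    Candidate("get", 7, 9, "verb", 0),
    Candidate("get up", 7, 12, "verb", 1),
    Candidate("up", 11, 12, "adverb", 0),
    Candidate("up to", 11, 15, "preposition", 1),
    Candidate("to", 14, 15, "preposition", 0),
]
for c in eliminate_contradictions(cands):
    print(c.text, c.active)
```

In this example "get up" and "up to" both carry the flag "1" and overlap at "up"; rule (1) prefers the verb phrase, so "up to" is deactivated while "to" remains available as a single word.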

The control section 1112 is a functional section that regulates and controls the operation and processing of each of the functional sections in the morphological analysis section 1016. It may be included in the control section 1038, which controls the entire device. The result of the morphological analysis is transmitted to the parsing section I 1020 via an output interface 1114. If the result is not transmitted directly to the parsing section I 1020, it is first stored in the parsing input file 1116 and in the parsing dictionary information file 1118.

In this embodiment, all words, compound words and phrases that start from the delimiter position of the dictionary lookup unit are extracted during the morphological analysis, while the dictionary information obtained for the individual words that make up a compound word or phrase judged to be a collective unit according to the highest preference flag is discarded. That is, the degree of coupling between the words in the sentence is judged with reference to the highest preference flag in the dictionary information obtained during the morphological analysis. Compound words or phrases judged to have strong coupling are treated as being used as a phrase in the sentence; otherwise, the possibility of use as individual words is also considered in parallel. This processing by the highest preference flag is carried out in the sequence shown in Fig. 4.

Data for the input character string are received from the input section 1010 (1200); a dictionary lookup unit is extracted from the input character string for lookup in the dictionary file 1018 with the highest preference flag (1201); the dictionary 1018 is then looked up accordingly (1203), up to the end position of the sentence represented by the input character string data (1202); the contradictions concerning the highest preference flag are then eliminated (1204); and the result of the morphological analysis is output to the parsing section I (1205).

In the input processing (1200), the data are first read from the file 1102 or the input unit 1100 into the input character string buffer of the input interface 1104 (see Fig. 5: 1210). The input character string data are entered, for example, as ASCII codes. When the data in the file have been read completely (for example, when the EOF symbol is read), the processing section 1106 writes a NULL code into the input character string buffer to mark the end position.

The processing section 1106 then reshapes the input character string (1211). For example, when two or more whitespace-equivalent characters occur in succession, they are replaced by a single space. Whitespace-equivalent characters include spaces (represented by a dedicated symbol), tabs, carriage returns (represented by the symbol ↙) and so on. Any whitespace-equivalent characters between the head of the input character string buffer and the first character that is not whitespace-equivalent are removed.

For example, the input character string "I will get up↙to go to a white house ...", shown in Fig. 6, is reshaped into "I will get up to go to a white house ... (NULL)". The position of the "NULL" symbol indicates the end position of the memory.
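The reshaping step (1211) can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the function name `reshape` is hypothetical, Python's `\s` class stands in for the whitespace-equivalent characters, a single tab or carriage return is also turned into a space, and the NULL code is modelled as the character `"\0"`.

```python
import re

NUL = "\0"  # stands in for the NULL code that marks the end position


def reshape(raw: str) -> str:
    """Collapse whitespace runs, drop leading whitespace, append the end marker."""
    # Every run of whitespace-equivalent characters (spaces, tabs, line
    # breaks) becomes one space.
    collapsed = re.sub(r"\s+", " ", raw)
    # Whitespace before the first real character is removed.
    return collapsed.lstrip(" ") + NUL


print(repr(reshape("  I  will\tget up\nto go to a white house")))
```

Normalizing every delimiter to one space means that later stages can compare the buffer directly against dictionary entries whose words are separated by single spaces.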

The dictionary lookup delimiters used in the extraction processing 1201 for the dictionary lookup unit are placed at the positions of alphabetic characters, numeric characters, apostrophes and other characters, except for hyphens and paragraph marks and except for an apostrophe that follows a space. The processing section 1106 has a head pointer for dictionary lookup, which is initially set to the first position of the buffer.

The lookup section 1108 looks up the dictionary file 1018, which is provided with the highest preference flag, using as the lookup key character string the characters from the one indicated by the head pointer up to the one preceding the next delimiter. The dictionary entry and the lookup key character string are compared; if they match, the dictionary information is read in (1203). A match is judged to exist when the entire character string of the entry coincides with at least part of the character string starting from the head, and when the character immediately after that part is a dictionary lookup delimiter, an apostrophe or a paragraph mark. For example, as shown in Fig. 7, when the head pointer indicates the first character "g" of the lookup key character string, the dictionary entry "getupon" matches it.

The retrieved dictionary information is stored in the buffer 1120 of the processing section 1106. Together with this information, the start position and the end position of the matching character string are stored. These positions are expressed as character positions in the input buffer, counted in order from the head. The buffer 1120 for retrieved dictionary information also has a storage area for the active information, which indicates whether or not the retrieved dictionary information is valid for the subsequent processing; at this step, all active information is set to "1".

The head pointer is then updated for each dictionary lookup: scanning the character string from left to right, it is set to the character immediately after the delimiter that appears nearest to the current head pointer. The dictionary lookup is then performed again. In the example above, the head character for the lookup is first "I" for "I", then "w" for "will" and then "g" for "get". When the head pointer reaches the NULL code, it is judged that the end position has been reached (1202).
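A minimal sketch of this pointer update, assuming spaces and commas as delimiters and using the end of a Python string in place of the NULL code. The function name `head_positions` is hypothetical, and skipping runs of adjacent delimiters is an added assumption that the description does not spell out.

```python
def head_positions(text: str, delimiters: str = " ,") -> list[int]:
    """Return the 0-based positions the head (upper) pointer visits."""
    positions = []
    head = 0
    while head < len(text):          # end of string stands in for NULL
        positions.append(head)
        # Find the nearest delimiter to the right of the current head.
        nxt = head
        while nxt < len(text) and text[nxt] not in delimiters:
            nxt += 1
        # Move to the character immediately after that delimiter.
        head = nxt + 1
        # Assumption: a run of delimiters (", ") is skipped as a whole.
        while head < len(text) and text[head] in delimiters:
            head += 1
    return positions

text = "I will get"
print([text[i] for i in head_positions(text)])  # ['I', 'w', 'g']
```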

Fig. 9 shows an example of the dictionary information retrieved in this way for the English input character string described above.

The contradiction-elimination processing 1204, which is carried out by the corresponding section 1110, will now be described with reference to Figs. 8A to 8D. The flowchart in Figs. 8A and 8B shows the processing for the case in which the positions of words whose highest-preference flags are set overlap one another. The flowchart in Figs. 8C and 8D shows the processing for removing elements from the parsing units with respect to the highest-preference flag, that is, for setting their active information to "0". In these flowcharts the symbol "≦" denotes an assignment, the symbol "→" denotes a reference, and "p → x" denotes the content of x held by the entry to which the pointer p points.

First, sets of words are detected in which each word has the highest-preference flag "1" and whose positions in the sentence overlap one another (steps 1220 to 1223). The elimination rules for the highest-preference flag are then applied to each detected set, and the entries that remain effective are selected (steps 1224 to 1235).

In the embodiment described above, as shown in Fig. 9, the highest-preference flag is "1" for "get up", with start position "8" and end position "13", and for "up to", with start position "12" and end position "16", within the character string "getupto", so the character positions of the two entries overlap. Rule (1) given above is therefore applied first: with reference to the part of speech of the entry at the save pointer psave and the part of speech of the entry at the pointer p, it is judged whether each is a verb (1224). Since "get up" is a verb in this example, the combination "get up" is selected.

If rule (1) is not satisfied, rule (2) is applied (1228): the length of the character string of the entry at the save pointer psave is compared with the length of the character string of the entry at the pointer p. If rule (2) is not satisfied either, rule (3) is applied (1229): the start position of the entry at psave is compared with the start position of the entry at p. As soon as one of rules (1) to (3), applied in this order, is satisfied, the active information of the entry that fails the rule, i.e. the non-effective entry, is set to "0" (1232), while the active information of the other, effective entry is left at "1" (1231). The contradiction-elimination rules are applied in this way to each entry in turn, the pointer p being advanced step by step (1234, 1235) up to the end position, so that the active information remains "1" only for the effective entries. The resulting state for the example above is shown in Fig. 10. For the entry "up to", for example, the active information is set to "0".
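The rule cascade for two overlapping entries can be sketched as below. This passage only states which fields each rule compares, so the direction of each rule is an assumption made for illustration: rule (1) is taken to prefer a verb, rule (2) the longer character string, and rule (3) the earlier start position. The `Entry` class, the function names and the part of speech given to "up to" are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    text: str        # dictionary entry
    pos: str         # part of speech
    start: int       # 1-based start position in the input buffer
    end: int         # 1-based end position (inclusive)
    pref: int = 0    # highest-preference flag
    active: int = 1  # active information, initially "1"

def overlaps(a: Entry, b: Entry) -> bool:
    """True if the inclusive position ranges share at least one character."""
    return a.start <= b.end and b.start <= a.end

def resolve(psave: Entry, p: Entry) -> Entry:
    """Apply rules (1) to (3) in order; deactivate the loser, return the winner."""
    verb_s, verb_p = psave.pos == "verb", p.pos == "verb"
    if verb_s != verb_p:                      # rule (1): assumed verb priority
        winner = psave if verb_s else p
    elif len(psave.text) != len(p.text):      # rule (2): assumed longer string
        winner = psave if len(psave.text) > len(p.text) else p
    else:                                     # rule (3): assumed earlier start
        winner = psave if psave.start <= p.start else p
    loser = p if winner is psave else psave
    loser.active = 0                          # non-effective entry becomes "0"
    return winner

get_up = Entry("get up", "verb", 8, 13, pref=1)
up_to = Entry("up to", "preposition", 12, 16, pref=1)
if overlaps(get_up, up_to):
    print(resolve(get_up, up_to).text, up_to.active)  # get up 0
```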

Next, the entries whose positions overlap, even partially, with a combination for which both the active information and the highest-preference flag are "1" are detected (1236 to 1241), and their active information is set to "0" (1242, 1249). This contradiction-elimination rule is applied to each entry in turn, the pointer p being advanced step by step (1243, 1248) up to the end position, and the active information of every non-effective entry is set to "0". Consequently, for the entry "get up", for example, the active information of "get" and of "up" is set to "0" (Fig. 10). The entries "white", "white house" and "house" all have the highest-preference flag "0", so their active information is left at "1" even though their positions overlap.
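A minimal sketch of this second pass, assuming the first pass has already left at most one active entry with the highest-preference flag "1" in any overlapping group. The `Entry` class and the function name are hypothetical, and the positions given for "white", "white house" and "house" are invented for illustration; only the positions of "get up" come from the description.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    text: str
    start: int       # 1-based start position
    end: int         # 1-based end position (inclusive)
    pref: int = 0    # highest-preference flag
    active: int = 1  # active information

def suppress_overlapped(entries: list[Entry]) -> None:
    """Set active to 0 for entries overlapping an active, preferred entry."""
    winners = [e for e in entries if e.active == 1 and e.pref == 1]
    for w in winners:
        for e in entries:
            if e.active == 1 and e.pref == 1:
                continue  # preferred entries were already settled in pass one
            if e.start <= w.end and w.start <= e.end:  # partial overlap counts
                e.active = 0

entries = [
    Entry("get", 8, 10),
    Entry("up", 12, 13),
    Entry("get up", 8, 13, pref=1),
    Entry("white", 22, 26),         # hypothetical positions
    Entry("white house", 22, 32),   # hypothetical positions
    Entry("house", 28, 32),         # hypothetical positions
]
suppress_overlapped(entries)
print({e.text: e.active for e in entries})
```

In this sketch "get" and "up" end with active information 0, while the three "white house" entries keep 1 because none of them carries the preference flag.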

When the processing has been carried out in this way up to the position immediately before the end position (NULL), the contents of the input buffer of the input interface 1104 and of the retrieved-dictionary-information buffer 1120 are output via the output interface 1114 to the parsing section I 1016. Of the contents of the buffer 1120, only the entries whose active information is "1" are output. For example, the contents of the input buffer may be written to the parsing input file 1116 and the contents of the information buffer 1120 to the parsing dictionary information file 1118. Since in this case both the active information and the highest-preference flag are output, the structure of the information file 1118 is identical to that of the information buffer 1120. Alternatively, the active information and the highest-preference flag may be omitted from the output.

The invention will now be described with reference to a second embodiment of a language analyzer, shown in Fig. 11, which is applied to an automatic English-to-Japanese translation device. This embodiment has an input section 2014, to which data are supplied from an input device 2010 or an original-text file 2012. The input device 2010 comprises, for example, a keyboard with character keys such as alphanumeric keys and function keys, and an optical character reader for reading English text recorded on paper. The original-text file 2012 is a storage device in which the English text is recorded on a storage medium such as a magnetic disk.

The input section 2014 has an input character string buffer 2014a and stores in it the English sentence supplied from the input device 2010 or the original-text file 2012. The input section 2014 reads out the input sentence stored in the buffer 2014a and passes it to a processing section 2016.

The processing section 2016 is a functional section that carries out the morphological analysis of the input sentence supplied by the input section 2014 by querying a dictionary file. The processing section 2016 has a dictionary information retention table 2016a, in which it stores the information obtained by querying a dictionary file 2022 or a base-unit dictionary file 2026, both described later.

The processing section 2016 extracts query key strings, each serving as one unit for querying the dictionary, from the character string that makes up the input sentence supplied by the input section 2014. The query key strings are extracted in order, starting from the first character of the input sentence, according to a predetermined extraction rule. For example, the input sentence is divided in order from its head at delimiters such as spaces and commas, and the resulting substrings are used as the query key strings. In this case, character strings that express units, such as m, k or m/s, also form query key strings. The processing section 2016 sends each extracted query key string to the dictionary query section 2020.
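The example extraction rule can be sketched in a few lines. This assumes that spaces and commas are the only delimiters (the description also allows others), and the function name `query_keys` and the sample sentence are hypothetical.

```python
import re

def query_keys(sentence: str) -> list[str]:
    """Split the input sentence at spaces and commas, from its head onward."""
    # A unit string such as "km/s" contains no delimiter and so stays one key.
    return [key for key in re.split(r"[ ,]+", sentence) if key]

print(query_keys("It moves at 8 km/s, roughly"))
# ['It', 'moves', 'at', '8', 'km/s', 'roughly']
```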

The dictionary query section 2020 queries a dictionary file 2022 on the basis of the query key strings supplied by the processing section 2016. In the dictionary file 2022, entries and grammatical information such as the part of speech are stored as shown in Fig. 12. If the entry is present in the dictionary file 2022, the dictionary query section 2020 reads out the part-of-speech information etc. of that entry and passes it to the processing section 2016. If the query finds no entry in the dictionary file 2022, the dictionary query section 2020 reports this to the processing section 2016.

The processing section 2016 stores the part-of-speech information etc. retrieved by the section 2020 in the dictionary information retention table 2016a. If the file 2022 contains no entry for the query key string, the processing section 2016 passes the query key string to a unit recognition section 2024.

The section 2024 queries a base-unit dictionary file 2026 on the basis of the query key string supplied by the processing section 2016. Base-unit entries are stored in the file 2026 as shown in Fig. 13. If the base-unit entry is present in the file 2026, the unit recognition section 2024 reads it out. If no entry is present, the query key string is divided into several substrings, as described later, and the file 2026 is queried several times. If base-unit entries are found in these repeated queries, several items of unit information are obtained from the base-unit entries. If any one of the repeated queries finds no base-unit entry, information is obtained indicating that the string is not registered in the dictionary.

The unit recognition section 2024 passes the base-unit entry, the compound-unit information, or the information indicating that the word is not registered in the memory to the processing section 2016, which stores the information received from the section 2024 in the table 2016a. For each query key string, the table 2016a holds the entry and the grammatical information, such as the part of speech, obtained by querying the file 2022 or the file 2026. Once the data have been stored in the table 2016a, the processing section 2016 passes them, together with the input sentence, to the output interface 2018. The interface 2018 outputs the input sentence and the morphological analysis data supplied by the processing section 2016 to an output unit 2030, such as a printer or a display unit, or to a storage file 2032, such as a magnetic disk.

Alternatively, the input sentence and the morphological analysis data output by the processing section 2016 may be fed directly into a parsing device (not shown), which carries out a syntax analysis of the input sentence and then prepares a translated sentence on the basis of that analysis. The control section 2028 controls the operation of each of the functional sections of the device described and may advantageously be implemented as a microprocessor.

The operation of the device described will now be explained with reference to the flowchart shown in Fig. 15. First, the English input sentence is read from the input device 2010 or from an original-text file 2012 into the input section 2014 (2100). The sentence read into the input section 2014 is stored in the input character string buffer 2014a. The sentence in the buffer 2014a is then read out and passed to the processing section 2016. When the sentence reaches the processing section 2016, the dictionary query unit is extracted (2102). That is, the character string (word sequence) that makes up the input sentence is divided, according to a predetermined rule and in order from its head, into query key strings, which serve as the units for querying the dictionary file 2022 or the base-unit dictionary file 2026. It is then judged whether an extracted query key string is present (2104); if it is, the query key string is passed to the dictionary query section 2020.

When the query key string has been passed to the section 2020, that section queries the file 2022 for it (2106). It is then judged whether the string is present among the entries of the dictionary file 2022 shown in Fig. 12; if the entry is present, the grammatical information stored in the file 2022, such as the part of speech, is read out, sent to the processing section 2016 and stored in the table 2016a (2110). The flow then returns to step 2102, and the next dictionary query unit is extracted.

If no entry is present in the file 2022, the section 2020 returns the query key string to the processing section 2016. The section 2016 then passes the query key string to the recognition section 2024, in which the unit recognition is carried out (2112).

If the query key string passed to the section 2020 is an ordinary word such as a noun or a verb, it is in most cases an entry in the dictionary file 2022, so grammatical information such as the part of speech is read from the file 2022, passed to the processing section 2016 and recorded in the table 2016a. As described above, the file 2022 contains entries for ordinary words such as nouns and verbs, but no entries for character strings that express units. Consequently, if the query key string expresses a unit such as kg or m/s, it has no entry in the file 2022, and the flow proceeds to the unit recognition of step 2112.

The unit recognition operation of step 2112 is explained with reference to Fig. 16. When a query key string for which the query found no entry in the file 2022 is passed from the processing section 2016 to the section 2024, the section 2024 sets the pointer P on the first character of the query key string (2200).

The section 2024 then queries the base-unit dictionary file 2026 for the character string that begins at the character on which the pointer P is set (2201).

In this query it is judged whether a base unit that has an entry in the file 2026 appears as a complete character string within the string that begins at the character on which the pointer P is set, and whether it begins at that character. In other words, it is checked whether a string of one or more characters, beginning at the character on which the pointer is set, matches any of the base units that have an entry in the file 2026. For example, if the character on which the pointer P is set is k, m, s, etc., the file 2026 contains entries for these single characters, as shown in Fig. 13.

The unit recognition section 2024 judges whether the query found an entry in the file 2026 (2204). If the entry is present, the pointer P advances by the length of the recognized base unit (2206). Thus, if the base unit is k, m, s, etc., the pointer P is advanced by one character and set on the next character of the query key string.

Section 2024 then judges whether any character string remains that begins at the character at which the pointer P is set (2208). If such a string remains, processing returns to step 2202, where file 2026 is looked up again for the string beginning at the character at which the pointer P is set. The section then judges again whether the lookup has found an entry for a basic unit; if it has, the pointer P is advanced by the length of the recognized basic unit.

If at step 2208 no character string remains beginning at the character at which the pointer P is set, the lookup in file 2026 is finished; that is, recognition of the compound unit has succeeded.

For example, suppose the retrieval key string passed to section 2024 is km/s, which denotes a unit. File 2026 has no entry for km/s as a whole, because km/s is itself a compound unit. The pointer P is first set to k (2200), and k is looked up in the basic-unit dictionary file 2026 to confirm that an entry exists (2202). The pointer P is then set to m (2206), and m is looked up in file 2026 (2202) to confirm its entry in the same way. Because the unit recognition section 2024 treats a slash (/), a circle (○), and similar symbols as part of a unit, the pointer P is next set to s, skipping the "/" in km/s (2206). Then s is looked up in file 2026 and its entry is confirmed in the same way (2202). Since entries were found in file 2026 for each of k, m, and s, km/s is judged to be a character string that expresses a unit. In general, a retrieval key string is judged to express a unit if file 2026 has entries for all the characters that make up the string, or for all of them except symbols such as the slash and the circle, which are treated as part of a unit in any case.
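The pointer-driven lookup described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the set `BASIC_UNITS` stands in for the basic-unit dictionary file 2026 with invented contents, the function name `recognize_unit` is hypothetical, and trying the longest candidate string first is an assumption (the description only requires that a string of one or more characters starting at the pointer matches an entry).

```python
from typing import List, Optional

BASIC_UNITS = {"k", "m", "s", "g", "Hz"}  # hypothetical contents of file 2026
CONNECTORS = {"/", "○"}                   # symbols treated as part of a unit


def recognize_unit(key: str) -> Optional[List[str]]:
    """Return the basic units found in `key`, or None if `key` is not a unit."""
    found: List[str] = []
    p = 0                                   # pointer P (step 2200)
    while p < len(key):                     # step 2208: does a string remain?
        if key[p] in CONNECTORS:            # skip "/" and similar symbols
            p += 1
            continue
        # Step 2202: look up the strings starting at P, longest candidate first.
        match = next(
            (key[p:p + n] for n in range(len(key) - p, 0, -1)
             if key[p:p + n] in BASIC_UNITS),
            None,
        )
        if match is None:                   # step 2204: no entry, not a unit
            return None
        found.append(match)
        p += len(match)                     # step 2206: advance P
    return found or None
```

With these assumed dictionary contents, `recognize_unit("km/s")` yields the basic units k, m, and s, while a string containing any character without an entry is rejected.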

When the unit recognition section 2024 has finished the lookup in file 2026 and has recognized the compound unit, it passes the resulting unit information to the processing section 2016, where it is stored in table 2016a (2210). Unit recognition then ends.

If at step 2204 the lookup in file 2026 finds no entry for the character string beginning at the character at which the pointer P is set, the string cannot be recognized as either a basic unit or a compound unit. Section 2024 therefore returns to the processing section 2016 information indicating that the string is a word not registered in the dictionary, that is, that the word does not express a unit. This information is stored in table 2016a of the processing section 2016, and unit recognition ends.

When unit recognition (2112) has finished, processing returns to step 2101 in the flowchart of Fig. 15, and the processing section 2016 again controls the extraction of a dictionary reference unit. After the extraction, section 2016 judges whether an extracted unit exists (2104). If no extracted unit, that is, no retrieval key string, remains, the section outputs the information stored in table 2016a to the output unit via the output interface 2018 (2114). Parsing of the input sentence is thereby complete. As described above for this embodiment, the input English sentence is divided into retrieval key strings, which are first looked up in an ordinary dictionary file 2022; if file 2022 has no entry, unit recognition is performed. In unit recognition, the retrieval key string is subdivided under the control of the pointer P, and the basic-unit dictionary file 2026 is looked up for each resulting substring. A string that is recorded in file 2026, or that is composed of a sequence of strings recorded in file 2026, is judged to be a character string expressing a unit.
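The two-stage lookup summarized above (ordinary dictionary first, unit recognition only as a fallback) might look roughly like this. The dictionary contents, the whitespace-based splitting into retrieval key strings, and the names `analyze` and `is_unit` are assumptions made for illustration; the list `table` plays the role of table 2016a.

```python
ORDINARY_DICTIONARY = {"the", "speed", "is"}  # stand-in for file 2022
BASIC_UNITS = {"k", "m", "s"}                 # stand-in for file 2026
CONNECTORS = "/○"                             # symbols treated as part of a unit


def is_unit(key):
    """True if every character other than a connector has a basic-unit entry."""
    symbols = [c for c in key if c not in CONNECTORS]
    return bool(symbols) and all(c in BASIC_UNITS for c in symbols)


def analyze(sentence):
    """Classify each retrieval key string as a word, a unit, or unknown."""
    table = []                                # plays the role of table 2016a
    for key in sentence.split():              # simplified extraction of keys
        if key.lower() in ORDINARY_DICTIONARY:
            table.append((key, "word"))       # entry found in file 2022
        elif is_unit(key):
            table.append((key, "unit"))       # recognized via file 2026
        else:
            table.append((key, "unknown"))    # not registered anywhere
    return table
```

Because compound units are resolved at lookup time, only the basic units have to be stored in the second dictionary.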

Because unit recognition is thus possible even for a character string that expresses a compound unit, by combining basic units stored in file 2026, parsing can cope with a wide variety of unit expressions. Moreover, file 2026 needs to store only basic units such as k, m, s, and so on; compound units built from them, such as km, ○ km/s, etc., need not be stored, so the capacity of the dictionary file can be reduced.

The overall structure of a third embodiment, in which the language analyzer is applied to an automatic English-to-Japanese translation apparatus, is described with reference to Fig. 18.

This embodiment has an input section 3010 through which an English text 3012 to be translated into Japanese is entered. The input section 3010 may comprise, for example, a keyboard with character keys such as alphanumeric keys and function keys, an optical character reader (OCR) for reading English text recorded on paper, and/or a file storage device for reading English text recorded on a storage medium such as a magnetic disk. The English text entered through section 3010 is read into a pre-editing section 3014, which performs preprocessing for translation, chiefly sentence recognition and the handling of unknown words. This serves as part of the morphological analysis. After pre-editing, the English data are transferred, together with the information obtained during pre-editing, to a morphological analysis section 3016. Section 3016 analyzes the morphemes of the English sentence: it divides the sentence by looking up a word dictionary 3018, performs various kinds of consolidation, such as the processing of unknown words, nouns, time expressions, and numbers, and carries out processing for the sentence as a whole. The morphological analysis rules are stored in a rule file 3036.

After morphological analysis, the English data are stored, together with the dictionary information obtained during that analysis, in a parsing section I 3020. Section I 3020 is a functional section that analyzes the surface structure of the sentence by applying grammar rules to the English data, and it finds all the structural possibilities.

The English data analyzed in section I 3020 are transferred, together with the analysis information obtained, to a parsing section II 3022. In this section, one solution is selected by applying a structural description to the surface-structure results of syntactic analysis I, and a plausible parse tree is thereby prepared for the English sentence in order to establish its structure. These parsing rules are stored in the rule file 3036.

The English data that have undergone this analysis are transferred, as parse tree data, to a structure conversion section 3024. In section 3024, a structure tree of the corresponding Japanese sentence is prepared from the structure tree that represents the intermediate structure of the English sentence; the sentence is thus converted into an underlying Japanese structure from which a Japanese sentence can easily be generated. The structure tree data representing the underlying Japanese structure obtained in this way are passed to a translation generation section 3026, in which the translated sentence is formed. This is a functional section that forms a Japanese sentence from the Japanese sentence structure tree.

The Japanese sentence data prepared as a translated sentence, that is, the translation data, are passed to a post-editing section 3030. In section 3030, the translation data are modified, using information retrieved from the dictionary 3018 and used in the translation, to produce a more natural Japanese sentence. The data for the Japanese sentence are transferred to an output section 3032, which outputs them as a translated Japanese sentence 3034. The output section 3032 includes, for example, a printer, a display, and/or a file storage device such as a magnetic disk.
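The order of the processing sections in this embodiment can be summarized as a simple pipeline. The sketch below only records which section handles the data at each stage, using the reference numerals from the description; the function `run_pipeline`, the `handlers` mapping, and the pass-through default are hypothetical and carry no linguistic logic.

```python
# Reference numerals and stage names in processing order.
STAGES = [
    (3014, "pre-editing"),
    (3016, "morphological analysis"),
    (3020, "parsing I"),
    (3022, "parsing II"),
    (3024, "structure conversion"),
    (3026, "translation generation"),
    (3030, "editing of the translation"),
]


def run_pipeline(text, handlers=None):
    """Pass `text` through every stage in order and return (result, trace)."""
    handlers = handlers or {}
    data = text
    trace = []
    for ref, name in STAGES:
        stage = handlers.get(ref, lambda d: d)  # default: pass data through
        data = stage(data)
        trace.append(f"{ref}: {name}")
    return data, trace
```

A real system would register one handler per reference numeral; here any stage without a handler simply forwards its input, which makes the data flow visible without implying how each stage works.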

The flow of this series of translation operations is controlled by a control section 3038, which controls the apparatus as a whole. In this embodiment the word dictionary 3018 stores dictionary data for English and Japanese words, in which, in addition to the vocabulary itself, various kinds of information are described, such as co-occurrence relations, meanings, plural or singular forms, and parts of speech. The file 3036 stores the rule data for the morphological and syntactic analyses.

The control section 3038 is connected to an operation display section 3040. Section 3040 has, for example, operation keys, such as a translation instruction key and cursor keys, with which the operator gives various instructions to the apparatus, and a display that visually presents the entered English text, the Japanese text resulting from translation, intermediate data such as dictionary information, and various indications for the operator. Many of these operation and display functions may be incorporated in the keyboard, if one is provided in the input section 3010, or in the display, if one is provided in the output section 3032.

Fig. 17 shows an example of a detailed structure for processing numbers in the morphological analysis section 3016. Section 3016 naturally has other functional sections as well, but only the parts directly needed for understanding the invention are shown here. The morphological analysis is carried out by ordering dictionary lookups successively from the beginning of the input character string, one retrieval key string at a time, and by processing the dictionary information obtained from the dictionary retrieval section 3104 according to a numeric flag, which is described later.

Section 3016 has an input processing section 3100 for receiving and processing the character string data supplied by the pre-editing section 3014. Section 3100 is provided with an input string buffer, which receives the English character string data in the form of code data, for example ASCII data, and stores them temporarily.

The input character string data held temporarily in section 3100 are passed to a unit dividing section 3102, which divides them into dictionary lookup units such as words. Section 3102 is a functional section that delimits the dictionary reference units, each of which serves as a retrieval key string when the dictionary 3018 is looked up in section 3104. The delimiters used for dividing the text into dictionary reference units are located at a particular position of an English character, a numeric character, an apostrophe, a character other than a hyphen and a period, and at an apostrophe that follows a space. These are stored in a delimiter table 3108, which section 3102 consults when dividing the text into dictionary reference units. The word dictionary 3018 contains, in particular, the information used for looking up the divided units. As the example of entry information in Fig. 24 shows, the entry for each dictionary reference unit, for example a word, stores grammatical information such as the part of speech, together with a numeric flag, which indicates whether the word represents a number, and numeric value information, which gives the numeric value of a word that denotes a number.

As shown in Fig. 17, both singular and plural forms are described together with each of the entries in the word dictionary 3018, and each of them constitutes an entry. The numeric flag indicates that a word denotes a number when it is set to "1". Further information is also registered, for example countability or uncountability for a noun, the distinction between transitive and intransitive verbs, and translated words. Taking "thousand" as an example, it is a noun that denotes a number, so its numeric flag is "1" and its numeric value is "1000". By contrast, "thread" is a noun but does not denote a number, so its numeric flag is registered as "0".
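A dictionary entry carrying the numeric flag and numeric value described above could be modelled as follows. The class `DictionaryEntry`, its field names, and the two sample entries are illustrative assumptions; the actual entry layout is the one shown in the figures and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    headword: str
    part_of_speech: str
    numeric_flag: int                     # 1 = the word denotes a number
    numeric_value: Optional[int] = None   # set only when numeric_flag is 1


# Hypothetical excerpt of word dictionary 3018.
WORD_DICTIONARY = {
    "thousand": DictionaryEntry("thousand", "noun", 1, 1000),
    "thread": DictionaryEntry("thread", "noun", 0),
}


def is_number_word(word: str) -> bool:
    """True if the word is registered and its numeric flag is set."""
    entry = WORD_DICTIONARY.get(word)
    return entry is not None and entry.numeric_flag == 1
```

Keeping the flag separate from the part of speech lets "thousand" and "thread" share the category noun while only the former takes part in number processing.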

A number is therefore recognized by means of the numeric flag when the word is registered in the dictionary 3018, for example "one" or "thousand". Unregistered words are also recognized as numbers: a sequence of digits such as "123"; two digit sequences with a period between them, that is, a decimal such as "10.2"; and a sequence of numeric characters with commas in between, such as "1,000,000". In this description, the term "numeric character" generally covers not only Arabic numerals but also spelled-out numeric expressions such as "thirteen".
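The three unregistered forms named above (plain digit sequences, decimals, and comma-grouped digits) can be matched with a single pattern. The regular expression below is an assumed formulation for the English notation only; the description does not state how the recognition is implemented.

```python
import re

# Digits, optionally followed by comma-separated groups of three digits,
# optionally followed by a decimal part.
NUMERAL = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def is_numeral(token: str) -> bool:
    """True if the whole token is written as a number in English notation."""
    return NUMERAL.fullmatch(token) is not None
```

The pattern is deliberately lenient (it does not limit the leading group to three digits); a stricter check would be a design choice outside what the text specifies.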

As shown in Fig. 28, the dictionary 3018 has a currency symbol table 3018a, in which various currency symbols are registered; a notation symbol table 3018b, in which the notation symbols (digit group separators) ",", ".", and " " (space) are registered; and a decimal point table 3018c, in which the decimal points "." and "," etc. are registered. Separate tables for notation symbols and decimal points are used because the usage of these symbols differs between the languages to be processed: in Japanese and English, "," is used as the notation symbol and "." as the decimal point, whereas in other European languages such as German and French, a space or "." is mainly used as the notation symbol and "," as the decimal point.
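The language-dependent use of the two symbol tables can be illustrated with a small lookup. The per-language assignments below follow the tendencies stated in the text but are simplified assumptions (one notation symbol per language), and the language codes and the function `parse_number` are hypothetical.

```python
# Simplified stand-ins for tables 3018b and 3018c, keyed by language code.
NOTATION_SYMBOLS = {"en": ",", "ja": ",", "de": ".", "fr": " "}
DECIMAL_POINTS = {"en": ".", "ja": ".", "de": ",", "fr": ","}


def parse_number(text: str, lang: str) -> float:
    """Convert a written number to a float using the symbols of `lang`."""
    group = NOTATION_SYMBOLS[lang]
    point = DECIMAL_POINTS[lang]
    # Remove the notation symbol first, then normalize the decimal point.
    normalized = text.replace(group, "").replace(point, ".")
    return float(normalized)
```

Removing the notation symbol before normalizing the decimal point matters for German-style input, where "." and "," swap roles relative to English.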

The dictionary retrieval section 3104 is a functional section that extracts dictionary information by looking up the word dictionary 3018 on the basis of the retrieval key string supplied by the unit dividing section 3102, and transfers that information to the processing sections 3110, 3112, and 3116.

Consecutive numeric characters are consolidated by the following two processes. First, when a word has been recognized as a number as described above, the next dictionary reference unit is examined; if it too is recognized as a number, the two are combined into a single number. This operation is repeated as long as numerals follow. For example, "30 thousand" is converted to "30000", and "1.5 million" to "1500000". Second, when the numeric expression continues with "and" in between, the two numbers are combined into one if, in the number to the left of "and", all the digits at the positions occupied by the digits of the number to the right of "and" are 0. For example, "one hundred and thirty" is combined into "130", and "30 thousand and two hundred" into "30200".
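The two consolidation processes can be sketched as two passes over a token list. The sketch rests on several assumptions that go beyond the text: adjacent numerals are merged by multiplication (which covers the examples given but not every English numeral phrase), the zero-digit condition for "and" is tested with a modulo operation, and the word list `NUMBER_WORDS` and all function names are invented for illustration.

```python
import re
from decimal import Decimal

NUMBER_WORDS = {"one": 1, "two": 2, "thirty": 30, "hundred": 100,
                "thousand": 1000, "million": 1000000}  # hypothetical entries
DIGITS = re.compile(r"\d+(?:\.\d+)?")


def value_of(token):
    """Return the token's value as a Decimal, or None if it is not a number."""
    if token in NUMBER_WORDS:
        return Decimal(NUMBER_WORDS[token])
    if DIGITS.fullmatch(token):
        return Decimal(token)
    return None


def fmt(value):
    """Render a Decimal without a trailing fractional zero."""
    return str(int(value)) if value == value.to_integral_value() else str(value)


def combine_numerals(tokens):
    # Process 1: merge adjacent numerals (multiplication is an assumption).
    merged = []
    for tok in tokens:
        val = value_of(tok)
        if val is not None and merged and isinstance(merged[-1], Decimal):
            merged[-1] *= val
        else:
            merged.append(tok if val is None else val)
    # Process 2: join "A and B" when the low-order digits of A that B
    # would occupy are all zero.
    out = []
    i = 0
    while i < len(merged):
        if (merged[i] == "and" and out and isinstance(out[-1], Decimal)
                and i + 1 < len(merged) and isinstance(merged[i + 1], Decimal)):
            width = len(str(int(merged[i + 1])))
            if out[-1] % (10 ** width) == 0:
                out[-1] += merged[i + 1]
                i += 2
                continue
        out.append(merged[i])
        i += 1
    return [fmt(x) if isinstance(x, Decimal) else x for x in out]
```

For instance, `combine_numerals("30 thousand and two hundred".split())` first reduces the phrase to 30000, "and", 200, and then joins the two numbers because the last three digits of 30000 are zero.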

After recognition as a number, the necessary local analysis (local parsing) is carried out. In this processing, a series of parsing units activated by the morpheme activation information of each parsing unit is combined into a single parsing unit on the basis of a local analysis rule. For example, a currency symbol and a numerical value such as "¥1,000" are combined as "one thousand yen", and a numerical value and a unit are combined, as in "1.5 km".

These combinations are performed in the processing sections 3110 to 3122. The processing section 3110 is a functional section for combining a number with a currency symbol or a unit. The processing section 3112 is a functional section for numbering, i.e. converting a number into a numerical value. The processing section 3114 is a functional section for processing numbers joined by a hyphen. The processing section 3116 is a functional section for processing consecutive numeric characters.

For a number that is to be combined with a currency symbol or a unit, the currency symbol and the numerical value are combined into a single noun in the processing section 3118, and the unit and the numerical value are combined into a proper noun in the processing section 3120. For a number that has undergone numbering, for a hyphenated number and for consecutive numeric characters, combination with the preceding numerical value is performed in the processing section 3122. The dictionary information for the input character string, completed by this processing, is stored in the ordered dictionary information buffer, i.e. the dictionary information preservation table 3124. The results of the morphological analysis are transferred from the table 3124 to the parsing section I 3020.

Processing based on the numerical flag proceeds as shown in Figs. 19A and 19B. The data of the input character string are received by the input processing section 3100, where input processing is performed (3200). The unit dividing section 3102 then divides the input character string into dictionary reference units for searching the dictionary 3018 (3201). The dictionary retrieval section 3104 searches the dictionary 3018 accordingly (3203) and, if there is a dictionary entry (3204), checks the numerical flag (3205). If the numerical flag is not set, because the word is not a number, the dictionary information is stored in the table 3124. If the numerical flag is set to "1", the number is converted to a numerical value in the processing section 3112 (3206), and the combination with the preceding numerical value (3207) is performed in the section 3122. When this processing has been carried out up to the end of the sentence indicated by the input character string data (3202), the combination with the currency symbol or the unit (3209) is performed in the sections 3118 and 3120, and the result of the morphological analysis is then output to the syntactic analysis section I (3020) (3210).

If the dictionary lookup at step 3204 yields no entry and the element is joined by a hyphen, processing for a hyphenated number (3213) is performed in the section 3114. If the element is not hyphenated but its first character is a currency symbol (3214), the currency symbol alone is stored in the table 3124 (3216) and is removed from the dictionary reference unit (3217). If the first character is not a currency symbol (3214), the processing for consecutive numeric characters (3215) is performed in the processing section 3116. The operation is carried out up to the end position (3202).

The combination with the currency symbol and the unit (3209) is performed in the processing section 3110 according to the processing flow shown in Fig. 20. In the initial processing (3220), the processing pointer is first set to the top of the buffer. If the element indicated by the pointer is not a numerical value (3221), the pointer is advanced by one (3226). If the element is a numerical value but has neither a preceding currency symbol nor a unit, the pointer is likewise advanced (3222, 3224). The processing is carried out up to the end of the dictionary reference units (3227).

If the element is a numerical value preceded by a currency symbol (3222), the currency symbol and the numerical value are combined into a single noun (3223). For example, the currency symbol and the numeric characters in "¥1,000" are combined into one noun. If the preceding element is not a currency symbol and the following element is a unit, the numerical value and the unit are combined into a single noun (3225). For example, the numeric characters and the unit in "1.5 km" are combined into a single noun. The processing is carried out up to the end of the dictionary reference units (3227).
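The flow of Fig. 20 can be illustrated with a minimal Python sketch. The `Entry` record, its `kind` tags and the function name are hypothetical and chosen only for illustration; the patent does not specify how the table 3124 is laid out.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    kind: str  # hypothetical tags: "number", "currency", "unit", "word", "noun"
    text: str


def merge_currency_and_units(table):
    """Scan the table once and merge each number with an adjacent
    currency symbol (before it) or unit (after it)."""
    merged = []
    i = 0
    while i < len(table):
        entry = table[i]
        if entry.kind == "number":
            if merged and merged[-1].kind == "currency":
                # Currency symbol + number -> one noun (cf. step 3223).
                symbol = merged.pop()
                merged.append(Entry("noun", symbol.text + entry.text))
                i += 1
                continue
            if i + 1 < len(table) and table[i + 1].kind == "unit":
                # Number + unit -> one noun (cf. step 3225).
                merged.append(Entry("noun", entry.text + " " + table[i + 1].text))
                i += 2
                continue
        # Anything else is kept unchanged and the pointer advances (cf. step 3226).
        merged.append(entry)
        i += 1
    return merged
```

Applied to the entries for "$", "1000", "1.5" and "km", the sketch would yield the two nouns "$1000" and "1.5 km".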

The processing of a hyphenated number is performed in the processing section 3114 according to the flowchart shown in Figs. 21A and 21B. First, in the initial processing, the hyphenated dictionary reference unit is stored in the buffer (3230). In addition, the numerical value "0" is preserved, and the hyphen in the original dictionary reference unit is changed to a space. The dictionary reference unit is then divided (3231) and a dictionary lookup is performed (3235). If the lookup yields no entry, i.e. if the word is not registered in the dictionary (3236), the entire hyphenated dictionary reference unit is stored in the table 3124 as a word not registered in the dictionary (3237).

If the dictionary lookup yields an entry (3236), it is checked whether its numerical flag is "1". If the numerical flag is not "1", the element is not a number, and the entire hyphenated reference unit is stored in the table 3124 as a word not registered in the dictionary (3237).

If the numerical flag of the dictionary entry is set to "1", the section 3112 converts the number into a numerical value on the basis of the entry data (3239). This value is then added to the numerical value preserved at that time (3240), and the result of the addition is preserved (3241). For "twenty-two", for example, the "2" of "two" is thus added to the "20" of "twenty" to give "22". The processing is carried out up to the end of the dictionary reference unit (3232).
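The hyphenated-number processing of Figs. 21A and 21B can be sketched as follows. The miniature `DICTIONARY` is a hypothetical stand-in for the word dictionary 3018: a numeric value plays the role of a numerical flag of "1", and `None` marks an entry that is not a number.

```python
# Hypothetical miniature dictionary (stand-in for dictionary 3018).
DICTIONARY = {
    "twenty": 20,
    "thirty": 30,
    "two": 2,
    "five": 5,
    "well": None,
    "known": None,
}


def hyphenated_number(unit):
    """Return the value of a hyphenated number such as "twenty-two",
    or None if the whole unit must be kept as an unregistered word."""
    total = 0  # preserved value starts at 0 (cf. step 3230)
    # The hyphen is replaced by a space and the unit is divided (cf. step 3231).
    for part in unit.replace("-", " ").split():
        if part not in DICTIONARY:
            return None  # no dictionary entry (cf. steps 3236, 3237)
        value = DICTIONARY[part]
        if value is None:
            return None  # numerical flag is not "1" (cf. step 3237)
        total += value  # add to the preserved value (cf. steps 3240, 3241)
    return total
```

With this dictionary, `hyphenated_number("twenty-two")` sums 20 and 2, while `hyphenated_number("well-known")` returns `None`, so the unit would be stored as a word not registered in the dictionary.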

When the end position is reached, the flow passes from step 3232 to the processing 3233, and the preserved numerical value becomes the numerical value of the entire hyphenated dictionary reference unit. The combination with the preceding numerical value (3207) is then performed.

The processing of consecutive numeric characters (3215), which is performed in the processing section 3116, is now explained with reference to Figs. 22A and 22B. In these flowcharts, the symbol "≦" denotes substitution (assignment). First, an initialization (3250) is performed in which the preserved numerical value val-save is set to "0", the parameter i is set to "1", and the pointer p is set to the head of the character string of the dictionary reference unit.

It is then checked whether the character *p indicated by the pointer p is a numeric character (3251), a notation character (3252) or a decimal point (3253). If it is none of these, the entire character string is stored in the table 3124 as a word not registered in the dictionary (3255). If it is a decimal point (3253), the parameter i is multiplied by 10 (3254) and step 3258 is performed. At step 3258, the numerical value num(*p) of the character *p is added to the preserved numerical value val-save to form a new preserved value. The numerical value num(*p) is the value obtained by treating the character *p as a number.

If the character is found to be a numeric character or a notation character at step 3251 or 3252, step 3257 is performed. At step 3257, the preserved numerical value val-save is multiplied by 10 and the numerical value num(*p) of the character *p is added to it to form a new preserved value.

After this processing, the pointer is advanced by one (3259), and the processing is repeated up to the end of the dictionary reference unit (3260). When the end position of the character string is reached, the preserved value becomes the numerical value of the entire character string (3261), and the combination with the preceding numerical value (3207) is performed in the section 3122. By this processing, consecutive numeric characters such as "1,000.5" are analyzed as the numerical value "1000.5".
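The character-by-character conversion can be sketched in Python as follows. This is an illustration under stated assumptions rather than a transcription of Figs. 22A and 22B: the text does not spell out how the parameter i and the notation characters enter the arithmetic, so the sketch assumes that a grouping comma contributes no digit and that i acts as a divisor that grows by a factor of 10 for each digit after the decimal point. The names `val_save` and `i` follow the text; the function name is hypothetical.

```python
def parse_numeric_characters(unit):
    """Convert a string such as "1,000.5" into a number, or return None
    if the unit contains a character that is not part of a number."""
    val_save = 0        # preserved value, initialised to 0 (cf. step 3250)
    i = 1               # scale parameter, initialised to 1 (cf. step 3250)
    seen_point = False  # assumption: tracks whether the decimal point was passed
    for ch in unit:
        if ch in "0123456789":
            val_save = val_save * 10 + int(ch)  # cf. step 3257
            if seen_point:
                i *= 10  # assumption: one factor of 10 per fractional digit
        elif ch == ",":
            continue  # assumption: the notation character adds no digit
        elif ch == "." and not seen_point:
            seen_point = True
        else:
            return None  # kept as an unregistered word (cf. step 3255)
    return val_save / i if seen_point else val_save
```

For "1,000.5" the loop accumulates 10005 with i equal to 10, which gives 1000.5, in line with the example in the text.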

The combination with the preceding numerical value (3207) is performed in the section 3122 as follows. First, the pointer p into the dictionary table is set to the position preceding the current dictionary reference unit (3270). If nothing is present at that position, the numerical value is the first entry in the preservation table, and the numerical value of the current dictionary reference unit is recorded in the table 3124 (3284). It is recorded at the position next to the position indicated by the pointer p.

If a word is present at the preceding position at step 3271, the entry indicated by the pointer p is not "and" (3272), and the pointer p does not indicate a numerical value (3273), the numerical value of the current dictionary reference unit is recorded in the table 3124 at the position next to the one currently indicated by the pointer p (3284). In the example "To him two . . .", "two" is newly recorded as the numerical value "2".

If the pointer p indicates a numerical value at step 3273, the numerical value p→v of the entry indicated by the pointer p is multiplied by the numerical value v-now of the current dictionary reference unit to form a new numerical value p→v for that entry (3274). In the case of "two thousand", for example, "2 × 1000 = 2000" is computed so that the whole of "two thousand" is combined into one expression. The end position of the current dictionary reference unit is then set as the end position p-end of the entry indicated by the pointer p (3282).

If the entry indicated by the pointer p is "and" at step 3272, the pointer p is moved to the dictionary reference unit preceding it (3275). If this is not the end position (3276) and the entry is a numerical value (3277), the numerical value v-now of the current dictionary reference unit is rounded up to the next power of ten above its most significant digit, and the result is set as the value vl (3278). For example, if the numerical value v-now of the current reference unit is "8", "8.1", "98" or "11", the value vl is "10", "10", "100" or "100", respectively.

It is then checked whether the remainder obtained by dividing the numerical value p→v of the entry indicated by the pointer p by vl, i.e. mod(p→v, vl), is "0" (3279). If it is not "0", the pointer p is incremented (3283), and the numerical value of the current reference unit is recorded in the table 3124 at the position next to the one indicated by the current pointer p (3284). In the case of "I and two", for example, "two" is newly recorded as the numerical value "2".
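The rounding to vl and the remainder test can be written compactly. The following Python sketch uses hypothetical function names; the treatment of an exact power of ten (for example 10 giving vl = 100) is an assumption that follows from counting integer digits, since the text gives no example for that case.

```python
def power_of_ten_above(v_now):
    """Round v_now up to the next power of ten above its most
    significant digit: 8 -> 10, 8.1 -> 10, 98 -> 100, 11 -> 100."""
    vl = 10
    while vl <= v_now:
        vl *= 10
    return vl


def can_join_across_and(p_v, v_now):
    """True if the value left of "and" (p_v) has only zeros in the digit
    positions that v_now would occupy, i.e. mod(p_v, vl) == 0."""
    return p_v % power_of_ten_above(v_now) == 0
```

For "two thousand and twenty-two", `can_join_across_and(2000, 22)` is true, so the values may be added; for a left-hand value of 2022 and a right-hand value of 5 the test fails, because the units position is already occupied.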

If the remainder at step 3279 is "0", the numerical value v-now of the current reference unit is added to the numerical value p→v of the entry indicated by the pointer p to form a new numerical value p→v for that entry (3280). In the case of "two thousand and two", for example, "two thousand" has already been combined into "2000" at this stage. The "2" of "two" is then added to it by the addition 3280, so that the whole is combined into "2002". The information for "and", indicated by the pointer p + 1, is then deleted from the table 3124 (3281), and processing proceeds to step 3282.

The processing is now explained using an example. When dictionary retrieval is performed for the input character string "To him two thousand and twenty-two . . ." shown in Fig. 25, the dictionary entry information is written to the table 3124 as shown in Fig. 26A. For "him", for example, the start position is "4", the end position is "6", and the part of speech is pronoun. In the numerical processing, it is first determined for "two" that the numerical flag is "1" (3205) and that its numerical value is "2". Since the entry preceding "2" in this character string is not a numerical value, it is stored directly in the table 3124 (3206, 3208, 3284).

The pointer is then incremented to proceed to the processing of "thousand". Its numerical flag is "1" and its numerical value is "1000" (3205, 3206). Since the preceding entry is the numerical value "2" (3207, 3273), the multiplication 2 × 1000 is performed (3274), and the result is stored in the table 3124 (see Fig. 26B). For the following "and", the dictionary information is temporarily stored as it is in the table 3124 (see Fig. 26C).

The pointer is then advanced to process "twenty-two". Since this is a hyphenated word that is not found in the dictionary (3212), "20 + 2 = 22" is computed in the processing 3213 for hyphenated numbers (3237, 3239 to 3241). Since the preceding word is "and" (3272) and the numerical value preceding it is "2000" (3277), the numerical value "22" is rounded up to "100" (3278) and the remainder check 3279 is performed. Since the remainder is "0", the addition 3280 of "2000" and "22" is performed. The information for "and" is removed from the table 3124 (3281), and the result of the addition, 2022, is kept as a numerical value in the table 3124, so that "two thousand and twenty-two" is recognized as "2022". The combination with the preceding numerical value (3207) is thus complete.

A further example follows. As shown in Fig. 27, the input character string "You said $1,000.5 thousand was . . ." is parsed. "$1,000.5" is not registered in the dictionary 3018. Its first character is the currency symbol "$", which can be recognized as a currency symbol from its dictionary entry. It is recorded on its own in the table 3124 (3214, 3216, Fig. 29A).

Then "1,000.5" is converted into the numerical value "1000.5" by the consecutive numeric character processing 3215. Since the preceding entry is the symbol "$" and not a numerical value, the numerical value is recorded as it is (3270 to 3273, 3284, Fig. 29B).

The next word, "thousand", is a number whose numerical value is "1000". Since the preceding entry is a numerical value (3272, 3273), the calculation "1000.5 × 1000 = 1000500" is performed (3274, Fig. 29C). After the dictionary retrieval has been completed, the contents kept in the table 3124 are examined in sequence. Since the currency symbol "$" is present immediately before the numerical value "1000500", the two are combined, and "$1000500" is formed as a single word entry (3209, 3221 to 3223, Fig. 29D).
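Both worked examples can be reproduced with a short Python sketch. It is a simplification under stated assumptions: the table 3124 is modelled as a plain list, `NUMBER_WORDS` and `CURRENCY_SYMBOLS` are hypothetical stand-ins for dictionary entries with their flags, start and end positions are omitted, and numeric characters are converted with Python's `float` rather than by the character-by-character flow described above.

```python
NUMBER_WORDS = {"two": 2, "twenty": 20, "thousand": 1000}  # hypothetical entries
CURRENCY_SYMBOLS = {"$"}


def is_number(x):
    return isinstance(x, (int, float))


def power_of_ten_above(v_now):
    vl = 10
    while vl <= v_now:
        vl *= 10
    return vl


def record_number(table, v_now):
    """Combine v_now with the preceding table entries where the rules allow."""
    if table and is_number(table[-1]):
        table[-1] *= v_now  # multiply with the preceding number (cf. step 3274)
    elif (len(table) >= 2 and table[-1] == "and" and is_number(table[-2])
          and table[-2] % power_of_ten_above(v_now) == 0):
        table.pop()         # delete the "and" entry (cf. step 3281)
        table[-1] += v_now  # add across "and" (cf. step 3280)
    else:
        table.append(v_now)  # record as a new number (cf. step 3284)


def analyze(sentence):
    table = []
    for unit in sentence.split():
        parts = unit.lower().split("-")
        if all(p in NUMBER_WORDS for p in parts):
            # Plain or hyphenated number word: sum of its parts.
            record_number(table, sum(NUMBER_WORDS[p] for p in parts))
        elif unit[0] in CURRENCY_SYMBOLS:
            table.append(unit[0])  # symbol stored on its own (cf. step 3216)
            record_number(table, float(unit[1:].replace(",", "")))
        else:
            table.append(unit)
    # Final pass: merge a currency symbol with the number after it (cf. step 3209).
    merged = []
    for entry in table:
        if is_number(entry) and merged and merged[-1] in CURRENCY_SYMBOLS:
            merged[-1] = (merged[-1], entry)
        else:
            merged.append(entry)
    return merged
```

Calling `analyze("To him two thousand and twenty-two")` leaves the single number 2022 after "To" and "him", and `analyze("You said $1,000.5 thousand was")` pairs "$" with the value 1000500, matching Figs. 26 and 29 as described in the text.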

The fourth embodiment will now be explained. Fig. 30 shows the fourth embodiment of the language analyzer having the features of the invention, applied to an English-Japanese automatic translation device.

This embodiment has an input processing section 4014 into which data is input from an input device 4012. The input device 4012 comprises, for example, a keyboard with character keys such as alphanumeric and function keys, an optical character reader for reading English text recorded on paper, and a reader for a medium such as a magnetic disk.

The input processing section 4014 has a buffer 4014a for an input character string and stores the English sentence input from the device 4012 in the buffer 4014a. The section 4014 reads an input sentence stored in the buffer 4014a and passes it to the unit dividing section 4016. The section 4016 is a functional section that divides the sentence input from the section 4014 into dictionary reference units. Delimiters, such as the space and the comma, are stored in a delimiter table 4018. The section 4016 takes the delimiters from the delimiter table 4018 and divides the sentence input from the section 4014 into character strings, which serve as the units for searching a reference dictionary 4020, by splitting the sentence wherever a delimiter occurs. The divided character strings are input to the dictionary retrieval section 4022.
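The division into dictionary reference units can be sketched as below. This is a minimal sketch under stated assumptions: the function name `split_units` is hypothetical, the default delimiter set (space and comma) follows the examples named in the text, and delimiters are simply discarded, which the text does not specify.

```python
def split_units(sentence, delimiters=(" ", ",")):
    """Divide a sentence into retrieval strings at every delimiter."""
    units = []
    current = []
    for ch in sentence:
        if ch in delimiters:
            # A delimiter closes the unit collected so far.
            if current:
                units.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:  # last unit, not followed by a delimiter
        units.append("".join(current))
    return units


print(split_units("Mr. Smith lives in New York, USA"))
```

With the comma in the delimiter set, a string such as "1,000.5" would also be split, so a real delimiter table would need to treat digit grouping separately.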

The section 4022 searches the reference dictionary 4020 for each reference unit of the input sentence delivered by the section 4016. As shown in Fig. 31, the reference dictionary 4020 stores, for example, entries for English character strings together with their parts of speech, feature information, and so on. Besides the proper names shown in the figure, the dictionary 4020 also stores character strings for other parts of speech, such as verbs and adjectives. The designation "proper name" as the part of speech in the figure means that the string is subject to the processing for registered proper names described later; it does not denote a proper noun in the ordinary grammatical sense. The feature information indicates what the proper name denotes. It is not necessarily limited to a single feature because, as described later, a proper name can represent several features depending on how it is used.
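One possible in-memory shape for such dictionary entries is sketched below. The class name `DictEntry`, the function `lookup` and all sample entries are hypothetical; the actual contents of Fig. 31 are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DictEntry:
    part_of_speech: str
    # A proper name may carry several features, e.g. person and group.
    features: frozenset = field(default_factory=frozenset)


# Invented sample entries, for illustration only.
REFERENCE_DICTIONARY = {
    "Tokyo": DictEntry("proper name", frozenset({"place"})),
    "Ford": DictEntry("proper name", frozenset({"person", "group"})),
    "run": DictEntry("verb"),
    "large": DictEntry("adjective"),
}


def lookup(unit):
    """Return the entry for a retrieval string, or None if unregistered."""
    return REFERENCE_DICTIONARY.get(unit)


print(lookup("Ford"))
print(lookup("Zork"))
```

Storing the features as a set makes the later test for a common feature between two adjacent proper names a simple set intersection.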

The section 4022 searches the dictionary 4020 for each character string divided into reference units. If the character string is a proper name, it is passed to the proper name processing section 4024, where the processing for proper names described later is carried out. If it is not a proper name, it is passed to the processing section 4036 and kept in its table 4036a.

The proper name processing section 4024 comprises a processing section 4026 for the preceding sentence end, a processing section 4028 for the preceding proper name, a processing section 4030 for the proper name itself, a processing section 4032 for the preceding proper name and the proper name itself, and a section 4034 for default feature information. The section 4026 judges whether the character string preceding the string that was looked up by, and input from, the section 4022 is at the end of a sentence. If the preceding string is at the end of a sentence, the section 4026 converts the capital letter at the beginning of the string being processed into a lowercase letter, passes the result to the dictionary retrieval section 4022, and causes the section 4022 to search the reference dictionary 4020 again. A string that is not found even by this second lookup is judged to be an unregistered proper name, which is sent to the processing section 4036 and stored in the table 4036a. If the preceding string is not at the end of a sentence, the string is passed to the processing section 4036 as a proper name whose feature information is unknown and is registered in the table 4036a, as described later.

The section 4028 analyzes the feature information for the preceding character string received from the section 4026 and passes the result to the processing section 4030 for the proper name itself. The section 4030 checks the feature information of the proper name being analyzed, as described later. If feature information is not registered for either the proper name or the preceding proper name, the two are analyzed collectively using the registered information of the other one, and the result is kept in the table 4036a of the section 4036.

The processing section 4032 for the preceding proper name and the proper name itself checks which part of the feature information is common to the proper name and to the preceding proper name being analyzed, analyzes these proper names with respect to the common part, passes the result to the processing section 4036, and stores it in its table 4036a. The section 4034 provides feature information for a proper name that has been read out of the table 4036a, sent via the section 4016 to the section 4022, and found, as a result of searching the dictionary 4020 in the section 4022, to have no feature information. Since a proper name can have different kinds of features depending on how it is used, all feature information that needs to be taken into account is assigned, for example "person, place, group, and the like". After the proper name has been provided with the feature information, the section 4034 sends the data to the processing section 4036 and stores it in its table 4036a.

The section 4036 with the table 4036a stores the data from the section 4032 for the preceding proper name and the proper name itself, from the section 4034, and from the retrieval section 4022 in the dictionary information storage table 4036a, then reads out the stored data and passes it to the parsing section 4038. The section 4038 parses the input sentence after it has undergone morpheme analysis and has been read out of the table 4036a.

The operation of this device will now be explained with reference to the flowchart in Fig. 32. First, an English sentence input from the device 4012 is read into the input processing section 4014 (4100). The sentence is loaded into the buffer 4014a and then read out of the buffer 4014a to the unit dividing section 4016. When the sentence has been input, the section 4016 reads the delimiters from the delimiter table 4018 in order to divide the sentence into dictionary reference units (4102). That is, the character string representing the input sentence is divided successively, starting from its beginning, into retrieval strings, which are the units for searching the reference dictionary 4020, by splitting it wherever a delimiter such as a space or a colon occurs. The section judges whether the dictionary reference units, i.e. the retrieval strings, have been exhausted (4104); if retrieval strings remain, it passes the next retrieval string to the section 4022.

When a retrieval string has been passed to the section 4022, the section 4022 searches the reference dictionary 4020 for the retrieval string (4106). It judges whether the retrieval string is present among the entries of the reference dictionary 4020, as shown in Fig. 31 (4108). If an entry is present, it reads out the part-of-speech information stored in the dictionary 4020 and judges whether the retrieval string is a proper name (4110). If the retrieval string is not a proper name, the section 4022 passes the data read from the dictionary 4020 to the processing section 4036 and records it in its table 4036a (4112). Once the data is stored in the table 4036a, a notification that the data has been stored, together with the data for the retrieval string just stored, is input from the processing section 4036 to the section 4016. The flow then returns to step 4102, and the section 4016 again divides off a dictionary reference unit.

If the retrieval string is a proper name at step 4110, the section 4022 passes the data for the proper name read from the dictionary 4020 (hereinafter simply called the proper name) to the proper name processing section 4024, together with the data for the preceding retrieval string, which has been input from the table 4036a in the section 4036 to the section 4022 via the section 4016. The section 4024 then performs the processing for a proper name registered in the dictionary (4124).

The processing for a proper name registered in the dictionary will now be explained with reference to the flowchart in Fig. 33. The data passed from the section 4022 to the section 4024 is input, via the processing section 4026 for the preceding sentence end, to the processing section 4028 for the preceding proper name. The section 4026 has no function in the processing of a proper name registered in the dictionary. The processing section 4028 judges whether the entry preceding the proper name is a proper name not registered in the dictionary 4020, i.e. whether it was handled by the processing for unregistered proper names described later (4200). If it is an unregistered proper name, the processing section treats the proper name together with the preceding unregistered proper name as a single proper name having the feature information of the proper name (4202), passes the data to the processing section 4036, and stores it in its table 4036a (4218).

If the section 4028 judges at step 4200 that the entry preceding the proper name is not an unregistered proper name, it then judges whether the entry preceding the proper name is a proper name registered in the dictionary 4020 (4204). If the entry immediately preceding the proper name is a registered proper name, it is judged whether the feature information for the preceding proper name is known, i.e. whether it is registered in the dictionary 4020 (4206).

If the feature information for the preceding proper name is unknown, the flow proceeds to step 4202, where the proper name itself and the immediately preceding proper name are together treated as a single proper name with feature information (4202). The processing section 4028 for the preceding proper name then passes the data to the processing section 4036, in whose table 4036a the data is recorded (4218).

If the processing section 4028 judges at step 4206 that the feature information for the preceding proper name is not unknown, i.e. that it is registered in the dictionary 4020, the data is passed from the section 4028 to the processing section 4030. The section 4030 then judges whether the feature information of the proper name itself is unknown (4208). If it is unknown, the section 4030 treats the proper name itself and the immediately preceding proper name together as a single proper name carrying the registered feature information (4210) and passes the data to the processing section 4036, in whose table 4036a the data is recorded.

If the processing section 4030 finds that the feature information of the proper name itself is not unknown, i.e. that it is registered in the dictionary 4020, the section 4030 passes the data to the processing section 4032 for the preceding proper name and the proper name itself. The processing section 4032 then judges whether the feature information of the proper name itself and that of the immediately preceding proper name have any feature in common (4212). If there is a common feature, it treats the proper name itself and the immediately preceding proper name together as a single proper name having the common part as its feature information (4214) and passes the data to the processing section 4036, in whose table 4036a the data is then recorded (4218).

If the feature information of the proper name itself and that of the immediately preceding proper name have no feature in common, the processing section judges the proper name to be a proper name of its own, distinct from the immediately preceding proper name, with the feature information retrieved from the dictionary 4020 (4216), and passes the data to the processing section 4036, in whose table 4036a the data is then recorded (4218).

Returning to Fig. 32, if at step 4108 the retrieval string is not present among the entries of the reference dictionary 4020, it is judged whether the first character of the retrieval string is a capital letter. If it is not a capital letter, the section 4022 regards the retrieval string as an unregistered word and passes it to the processing section 4036 to be recorded in its table 4036a (4118).
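The decision sequence of Fig. 33 (steps 4200 to 4216) can be sketched as a single function. This is an illustrative reading of the flow, not the patented implementation: the names `Item`, `merge` and `process_registered` are hypothetical, unknown feature information is modeled as `None`, the merged name at step 4210 is assumed to take over the features of the preceding proper name, and the case in which the preceding entry is no proper name at all is assumed to leave both entries unchanged.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Item:
    text: str
    kind: str  # "proper", "unregistered_proper" or "word"
    features: Optional[FrozenSet[str]] = None  # None means unknown


def merge(prev, cur, features):
    """Treat two adjacent names as one proper name."""
    return Item(prev.text + " " + cur.text, "proper", features)


def process_registered(prev: Optional[Item], cur: Item) -> List[Item]:
    """Return the entries to keep in the table for prev and cur."""
    # 4200: preceding entry is an unregistered proper name
    if prev is not None and prev.kind == "unregistered_proper":
        return [merge(prev, cur, cur.features)]        # 4202
    # 4204: preceding entry is not a registered proper name (assumed: no merge)
    if prev is None or prev.kind != "proper":
        return [cur] if prev is None else [prev, cur]
    # 4206: features of the preceding proper name unknown
    if prev.features is None:
        return [merge(prev, cur, cur.features)]        # 4202
    # 4208: features of the proper name itself unknown
    if cur.features is None:
        return [merge(prev, cur, prev.features)]       # 4210 (assumed)
    # 4212: any feature in common?
    common = prev.features & cur.features
    if common:
        return [merge(prev, cur, common)]              # 4214
    return [prev, cur]                                 # 4216


place = frozenset({"place"})
print(process_registered(Item("New", "proper", place),
                         Item("York", "proper", frozenset({"place", "person"}))))
```

In every branch the returned entries would then be recorded in the table (4218).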

If the first character is a capital letter, the data for the retrieval string, together with the data for the preceding retrieval string, is passed from the section 4022 to the proper name processing section 4024, where the processing for a proper name not registered in the dictionary is performed (4120).

The processing of proper names not registered in the dictionary will now be described with reference to Fig. 34. The data for the retrieval string is passed, together with the data for the preceding retrieval string, to the processing section 4026, which judges whether the end of the preceding entry is a candidate for the end of a sentence (4300). The end of the preceding entry counts as such a candidate if it is, for example, a separate period (.).

If the end of the preceding entry is a candidate for the end of a sentence, the data is passed from the processing section 4026 to the processing section 4028, which then regards the preceding entry as the end of the sentence (4302), converts the first character of the retrieval string into a lowercase letter, and passes the string to the section 4022.

The section 4022 searches the dictionary 4020 for the retrieval string whose first character has been converted to lowercase (4304) and judges whether there is an entry for it in the reference dictionary 4020 (4306). If there is an entry, the section 4022 passes the data retrieved from the dictionary 4020 to the processing section 4036 and stores it in its table 4036a (4308). If there is no entry, the section 4022 restores the first character of the retrieval string to a capital letter and passes the string to the processing section 4036 as an unregistered proper name, whereby it is recorded in its table 4036a (4310).

If at step 4300 the processing section 4026 for the preceding sentence end judges that the end of the preceding entry is not a candidate for the end of a sentence, the data is passed from the section 4026 to the section 4028, which then judges that the preceding entry is not the end of the sentence (4312). The data is passed from the section 4028 to the section 4030, which then regards the retrieval string as a proper name whose feature information is unknown (4314).

The section 4030 then returns the data to the processing section 4028, which performs the processing for a proper name registered in the dictionary (4316). This processing is the same as that shown in Fig. 33.
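The handling of a capitalized string that is not in the dictionary (Fig. 34, steps 4300 to 4316) can be sketched as follows. The function name, the result labels and the tuple `SENTENCE_END_MARKS` are hypothetical. The text names only the period as a sentence-end candidate, so "?" and "!" are assumptions, as is treating the very first unit of the input like a unit that follows a sentence end.

```python
SENTENCE_END_MARKS = (".", "?", "!")  # "?" and "!" are assumptions


def process_unregistered(prev_unit, unit, dictionary):
    """Classify a capitalized retrieval string not found in the dictionary."""
    # 4300: does the preceding entry end with a sentence-end candidate?
    after_sentence_end = (prev_unit is None
                          or prev_unit.endswith(SENTENCE_END_MARKS))
    if after_sentence_end:                                  # 4302
        lowered = unit[0].lower() + unit[1:]
        entry = dictionary.get(lowered)                     # 4304
        if entry is not None:                               # 4306
            # An ordinary word capitalized only because it starts a sentence.
            return ("dictionary_word", lowered, entry)      # 4308
        # Capital letter restored: an unregistered proper name.
        return ("unregistered_proper_name", unit, None)     # 4310
    # 4312 to 4316: a proper name with unknown features, which is then
    # handed to the registered proper name processing of Fig. 33.
    return ("proper_name_unknown_features", unit, None)


sample_dictionary = {"the": "article"}  # invented sample entry
print(process_unregistered(".", "The", sample_dictionary))
print(process_unregistered("in", "Zork", sample_dictionary))
```

The second dictionary lookup is what keeps sentence-initial words such as "The" from being mistaken for proper names.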

When, in Fig. 32, the reference units are exhausted at step 4104, section 4022 outputs a signal indicating this to section 4034, which then reads out the information recorded in table 4036a of section 4036 and provides the proper names with the default feature information (4122).

The operation of providing a proper name with the default feature information will now be described with reference to Fig. 35. In section 4034, which is provided for this purpose, a pointer is first set to the top of the data in table 4036a (4400). That is, the pointer is set to the first entry of the input sentence, which has been divided into entries, each provided with information by the lookup in the reference dictionary 4020. It is then judged whether the entry indicated by the pointer is a proper name (4402); if it is a proper name, it is judged whether its feature information is unknown (4404). If it is not a proper name, the flow proceeds to step 4408 and the pointer is advanced to the next entry.

If the feature information of the proper name is found to be unknown at step 4404, the default feature information is provided (4406). The default feature information is given to a proper name whose feature information is unknown, as shown in the lower part of Fig. 36. There, for example, the proper name "Johnson", whose feature information is unknown, is provided with every kind of feature information, i.e. "person, place, group and other". Providing a proper name of unknown feature information with every kind of feature information leaves room for the proper name to be resolved to one of these features in the subsequent structural analysis.

If the feature information of the proper name is found not to be unknown at step 4404, the flow proceeds to step 4408 and the pointer is advanced to the next entry. It is then judged whether the entry indicated by the pointer is the last one (4408); if it is not, the flow returns to step 4402 and the next entry is checked to see whether it is a proper name. If the entry indicated by the pointer is the last one, the provision of the default information ends. Once the default feature information has been provided for the proper names, the data recorded in table 4036a are output from section 4036 to the syntactic analysis section 4036 (4124), which completes the morpheme analysis in this embodiment.
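
The pass of Fig. 35 amounts to a single scan over the table. The following is a minimal sketch under stated assumptions: the table is modelled as a list of dicts, `None` stands for "feature information unknown", and the names `DEFAULT_FEATURES` and `assign_default_features` are invented for the illustration.

```python
# Hypothetical representation of "every kind of feature information".
DEFAULT_FEATURES = frozenset({"person", "place", "group", "other"})


def assign_default_features(table):
    """Give every proper name of unknown feature information the default set."""
    for entry in table:                      # pointer set at the top (4400), advanced at 4408
        if not entry["proper"]:              # step 4402: not a proper name, skip
            continue
        if entry["features"] is None:        # step 4404: feature information unknown
            entry["features"] = DEFAULT_FEATURES   # step 4406
    return table


# Entries as they stand after the worked example in the text.
table = [
    {"text": "In", "proper": False, "features": None},
    {"text": "Tokyo Station", "proper": True, "features": frozenset({"place", "group"})},
    {"text": "Mr. Walter", "proper": True, "features": frozenset({"person"})},
    {"text": "met", "proper": False, "features": None},
    {"text": "Johnson", "proper": True, "features": None},
]
assign_default_features(table)
```

Only "Johnson" is changed by the call: entries that are not proper names, or whose feature information is already known, are left untouched.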

The operation of the device described above will now be explained using an input sentence as an example. The explanation assumes that the sentence given in the German text as "am Bahnhof Tokyo traf mr. Walter Johnson" is entered; its words are processed in the order "In", "Tokyo", "Station", "Mr.", "Walter", "met", "Johnson". First, input processing (4100) is performed to read the input sentence into the processing section 4014. The sentence is then divided into dictionary reference units (4102), i.e. split into words at the spaces. The reference dictionary 4020 is first looked up for "In" (4106). Since the reference dictionary 4020 contains no entry for the capitalized form "In", the flow first passes to the processing for a proper name not registered in the dictionary. However, since this word lies at the top of the input string, the preceding part is treated as a candidate for the end of a sentence, and the word is looked up again in the reference dictionary 4020 as "in", with the capital "I" converted to lowercase "i". Since it is not a proper name (4110), the data retrieved from the dictionary 4020 are recorded in table 4036a (4112). The dictionary 4020 is then looked up for "Tokyo" (4106). Since the dictionary 4020 contains no entry for "Tokyo" (4108) and the first character is a capital letter (4116), the processing for a proper name not registered in the dictionary is performed (4120), and the flow moves to Fig. 34. Since the preceding part "In" is not a candidate for the end of a sentence (4300), "In" is judged not to be the end of the sentence (4312), "Tokyo" is recognized as a proper name whose feature information is unknown (4314), and the processing for a dictionary-registered proper name is performed (4316).

The flow moves to Fig. 33. Since the preceding part "In" is neither an unregistered proper name (4200) nor a registered proper name (4204), "Tokyo" is recorded on its own as a proper name with its own feature information, i.e. as a proper name whose feature information is unknown (4216). Returning to Fig. 32, the dictionary 4020 is looked up for "Station" (4106). Since the dictionary 4020 contains an entry for "Station" (4108) and it is a proper name (4110), the processing for a dictionary-registered proper name is performed (4114), and the flow moves to Fig. 33. Since the preceding part "Tokyo" is an unregistered proper name (4200), "Tokyo Station" as a whole is recorded as a proper name having the feature information "place, group" carried by the word "Station" (4202). The reference dictionary 4020 is then looked up in Fig. 32 for "Mr." (4106). Since "Mr." has an entry in the dictionary 4020 and is a proper name (4110), the processing for a dictionary-registered proper name is performed (4114), and the flow moves to Fig. 33. The preceding part "Station" is not an unregistered proper name (4200) but a registered proper name (4204), and its feature information ("place, group") is not unknown (4206). Since "Mr." has the feature information "person", which is likewise not unknown (4208), it is judged whether the feature information of the preceding "Station" and that of "Mr." have any part in common (4212). Since "Station" means "place, group" while "Mr." means "person", they have nothing in common, and "Mr." is recorded on its own as a proper name with the feature information "person" (4216). The flow then returns to Fig. 32, and the reference dictionary 4020 is looked up for "Walter" (4106). Since the dictionary 4020 contains an entry for "Walter" (4108) and it is a proper name (4110), the processing for a registered proper name is performed (4114), and the flow moves again to Fig. 33. Since the preceding part "Mr." is not an unregistered proper name (4200) but a registered proper name (4204), since its feature information "person" is not unknown (4206), and since the feature information "person, place, group" of "Walter" is likewise not unknown, the common part of the feature information is determined (4212). Since there is a common part ("person"), "Mr. Walter" is recorded together as a single noun with the feature information "person" (4214).
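
The decisions of Fig. 33 traced in this walkthrough can be summarized in a small sketch. The `Entry` type, the function name `combine` and the use of `None` for "unknown" are assumptions made for the illustration. The branch in which exactly one side has unknown feature information is not spelled out step by step in this passage, so it is modelled here, as an assumption, by giving the combined name the known feature information of the other side.

```python
from typing import FrozenSet, NamedTuple, Optional


class Entry(NamedTuple):
    text: str
    proper: bool                          # is the entry a proper name?
    registered: bool                      # was it found in the dictionary?
    features: Optional[FrozenSet[str]]    # None means "unknown"


def combine(prev: Entry, cur: Entry) -> Optional[Entry]:
    """Return the merged proper name, or None if `cur` is recorded on its own (4216)."""
    merged_text = f"{prev.text} {cur.text}"
    if not prev.proper:                               # 4200 no, 4204 no
        return None
    if not prev.registered:                           # 4200 yes, then 4202
        return Entry(merged_text, True, True, cur.features)
    if prev.features is None or cur.features is None:       # 4206 / 4208 (assumed handling)
        known = cur.features if prev.features is None else prev.features
        return Entry(merged_text, True, True, known)
    common = prev.features & cur.features             # 4212: common part
    if common:
        return Entry(merged_text, True, True, common)  # 4214
    return None                                       # 4216


tokyo = Entry("Tokyo", True, False, None)
station = Entry("Station", True, True, frozenset({"place", "group"}))
mr = Entry("Mr.", True, True, frozenset({"person"}))
walter = Entry("Walter", True, True, frozenset({"person", "place", "group"}))

tokyo_station = combine(tokyo, station)
mr_walter = combine(mr, walter)
```

With these inputs, "Tokyo" and "Station" merge into one name carrying "place, group", "Station" and "Mr." stay separate because their feature sets do not intersect, and "Mr." and "Walter" merge with the single common feature "person".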

Next, "met" is looked up in the dictionary 4020. Since there is an entry (4108) and it is not a proper name (4110), the data retrieved from the dictionary 4020 are recorded in table 4036a (4112). "Johnson" is then looked up in the dictionary 4020 (4106). Since there is no entry for "Johnson" (4108) and the first character is a capital letter (4116), the processing for a proper name not registered in the dictionary is performed (4120), and the flow moves to Fig. 34. Since the preceding part "met" is not a candidate for the end of a sentence (4300), "met" is judged not to be the end of the sentence (4312), "Johnson" is treated as a proper name with unknown feature information (4314), and the processing for a dictionary-registered proper name is performed (4316). The flow then moves again to Fig. 33. Since the preceding part "met" is neither an unregistered (4200) nor a registered proper name (4204), "Johnson" is recorded on its own as a proper name whose feature information is unknown.

After the processing steps above, the default feature information is provided for proper names as shown in Fig. 35. The pointer is set to "In", the first dictionary reference unit (4400). Since it is not a proper name (4402), the pointer is advanced (4408) and set to "Tokyo Station". Since "Tokyo Station" is a proper name (4402) and its feature information is not unknown (4404), the whole of "Tokyo Station" having been recognized as "place, group" in the preceding processing for a registered proper name, the pointer is advanced (4408) and set to "Mr. Walter".

Since "Mr. Walter" is also a proper name (4402) and its feature information is not unknown (4404), the pointer is advanced (4408). Since "met" is not a proper name (4402), the pointer is advanced (4408). Since "Johnson" is a proper name (4402) whose feature information is unknown, the default feature information is provided (4406), and "Johnson" is given the feature information "person, place, group and the like", as shown in Fig. 36.

As described above, in this embodiment an input German (English) sentence is divided into retrieval strings, which are then looked up in the reference dictionary 4020; if the dictionary 4020 contains an entry marked as a proper name, the processing for a registered proper name is performed. In that processing the preceding retrieval string is taken into account, and if it is a proper name, the feature information of the preceding retrieval string and of the proper name under consideration is examined. If one of them has no feature information, it is given the feature information of the other, whereas if both have feature information, the common part is taken as the feature information of these proper names. Consequently, a proper name without feature information can be suitably provided with feature information, and the feature information provided can be narrowed down to more appropriate feature information. This permits more efficient analysis in the subsequent structural analysis and in carrying out the corresponding translation.

Furthermore, for a character string not registered in the dictionary 4020 whose first character is a capital letter, if the preceding string is found to be the end of a sentence, the capital letter is converted to lowercase and the dictionary 4020 is looked up again, so that a string at the beginning of a sentence can also be found in the dictionary 4020. If a string beginning with a capital letter appears anywhere other than at the beginning of the sentence, it is judged to be a proper name, and it is given the feature information of any proper name with registered feature information that precedes or follows it. Consequently, even a proper name not registered in the dictionary 4020 can be parsed to some extent.

Moreover, since a proper name that has no feature information is provided with all the necessary feature information and the unneeded feature information is removed during processing of the word, it is possible to analyze a proper name whose feature information is unknown or an unregistered proper name.

Moreover, since a particular proper name may be provided with several items of feature information, and suitable ones are selected depending on the feature information of the proper names before and after it, the appropriate feature information can be selected when a proper name with several items of feature information is parsed in relation to its neighbours, which makes efficient parsing of the input sentence possible.

A fifth embodiment will now be described. Fig. 38 shows the overall structure of the fifth embodiment of a language analyzer having features of the invention, applied to an automatic German (English)-to-Japanese translation device. This embodiment has an input section 5010, through which a German/English text 5020 to be translated into Japanese is entered. The input section 5010 may be, for example, a keyboard with character keys such as alphanumeric and function keys, an optical character reader (OCR) for reading German/English text recorded on paper, and/or a file storage device for reading German/English text recorded on a storage medium such as a magnetic disk.

The German/English text entered through the input section 5010 is read into a pre-editing section 5014, where preprocessing for the translation is carried out; this consists mainly of sentence recognition and the handling of unknown words. These functions are part of the morphological analysis. The pre-edited German/English data are transferred, together with the information obtained during pre-editing, to a morphological analysis section 5016. Section 5016 divides the data into sentences while looking up a word dictionary 5018, analyzes the German/English morphemes, performs processing for unknown words, proper names and various compounds such as time expressions, and performs sentence-level processing such as tag-question and apposition recognition. The rules for the morphological analysis are stored in a rule file 5036.

After the morphological analysis, the German/English data are transferred, together with the dictionary information obtained during that analysis, to a syntactic analysis section I 5020. Section I 5020 is a functional section that parses the surface structure of the sentence by applying grammatical rules to the German/English data, and it finds every structural possibility.

The German/English data that have undergone syntactic analysis in section I 5020 are passed, together with the analysis information, to a syntactic analysis section II 5022, where one solution is selected from the surface-structure results of syntactic analysis I by applying the structure description. A plausible parse tree for the German/English sentence is thus prepared and its structure established. The rules for syntactic analysis are likewise stored in the rule file 5036.

After the syntactic analysis, the German/English data are transferred as parse-tree data to a structure conversion section 5024. In section 5024, a structure tree of the corresponding Japanese sentence is prepared from the structure tree that forms the intermediate structure of the German/English sentence; that is, the tree is converted into an underlying Japanese structure from which the Japanese sentence can easily be generated. The data of the structure tree representing the underlying Japanese structure obtained by the structure conversion are passed to a translation section 5026, in which a translated sentence is generated. This is the functional section that generates a Japanese sentence from the Japanese structure tree.

The Japanese data resulting from the translation, i.e. the data of the translated sentence, are then passed to a post-editing section 5030. Section 5030 modifies the translated data by looking up the dictionary 5018, using information employed in the translation, in order to produce a more natural Japanese sentence. The data of the Japanese sentence are passed to an output section 5032 and output from it as a translated Japanese sentence 5034. The output section 5032 may comprise, for example, a printer, a display and a file storage device such as a magnetic disk. The flow of this series of translation operations is governed by a control section 5038, which controls the device as a whole. The word dictionary 5018 stores dictionary data for German/English and Japanese words; in addition to the vocabulary, it holds various information such as a connecting relationship (i.e. an existing relationship), meanings, singular or plural form, part of speech and so on. A file 5036 stores the rule data for the morphological and syntactic analysis.

The control section 5038 is connected to an operation and display section 5040. This section has operating keys with which an operator gives various indications to the device described, for example a translation indication key, a cursor key, etc., and a display that visually presents an input German/English text, a Japanese text resulting from the translation, intermediate data such as dictionary information, various indications for the operator, etc. The arrangement may also be such that most of the operation and display functions are provided in a keyboard when one is arranged at the input section 5010, or in a display when one is arranged at the output section 5032.

Fig. 37 shows in detail the structures of the morphological analysis section 5016 that concern the processing of proper names. Only the part of section 5016 that bears directly on an understanding of the invention is shown, although there are of course other functional sections concerned with the morphological analysis. The morphological analysis is performed by instructing dictionary retrieval successively from the head of the input character string according to the retrieval key string, and by processing the dictionary information thus obtained from the dictionary retrieval section 5104 according to the position information of the proper name, as described later.

Section 5016 has an input processing section 5100 for receiving the data for the character string input from section 5014 and for performing the input processing. Section 5100 is provided with an input character string buffer, which is supplied with the German/English character string data in the form of code data, for example ASCII, and temporarily stores the character string data.

The data for an input character string temporarily stored in section 5100 is delivered to a section 5102, which divides the data into dictionary reference units such as words. Section 5102 is a functional section that identifies the dictionary reference units. These units serve as the retrieval key strings with which section 5104 successively searches the dictionary 5018. The dictionary reference delimiters used in dividing the input into dictionary reference units are placed at the respective position of the German/English character as a number, an apostrophe, a character other than a hyphen and a period, and an apostrophe that follows a space. They are stored in a delimiter table 5108, which section 5102 consults when dividing the input into dictionary reference units.

The reference dictionary 5018 stores in particular the information to be retrieved for each division unit. For example, as shown in Fig. 38 as an example of the entry information, grammar information such as the part of speech is included for each dictionary reference unit, for example for the entry for a word. For a noun, the part-of-speech information indicates whether it is a common noun or a proper name. For a proper name, an identifying indication is stored that shows how its position in the sentence is restricted, i.e. position information for the proper name. This is described in detail later. Other information is registered as well, for example whether a noun is countable or uncountable, whether a verb is intransitive or transitive, its translation, etc.

There are four kinds of position information for a proper name, namely, in the present embodiment, "0" to "3". Pattern "0" indicates a proper name without positional restriction, for example "Stadt/City" or a personal name such as "Walter". Pattern "1" indicates a proper name, for example "Mr.", that is placed at the head of a single proper name or of a sequence of several proper names, i.e. of a proper name formed as a single group of proper names. Pattern "2" indicates a proper name, for example "Bahnhof/Station" or "Bucht/Bay", that is placed at the end of a single proper name or of a proper name formed as a group of proper names, and that does not fall under pattern "3" described next. Pattern "3" indicates a proper name, for example "River" in "the Sumida River", that is the same as pattern "2" except that it is accompanied by a definite article "the/der" at the beginning of the proper name formed as a group of proper names.
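As an illustration only, the four position patterns and a few dictionary entries could be modeled as below. The names `ProperNamePosition`, `DictionaryEntry` and `SAMPLE_DICTIONARY` are hypothetical and do not appear in the patent; the example words are the ones named in the text, plus one ordinary verb added for contrast.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class ProperNamePosition(IntEnum):
    """Position information for a proper name (patterns "0" to "3")."""
    UNRESTRICTED = 0        # no positional restriction, e.g. "City", "Walter"
    GROUP_HEAD = 1          # head of a proper-name group, e.g. "Mr."
    GROUP_END = 2           # end of a proper-name group, e.g. "Station", "Bay"
    GROUP_END_WITH_THE = 3  # like pattern 2, but the group begins with "the"


@dataclass(frozen=True)
class DictionaryEntry:
    """Hypothetical, heavily simplified dictionary record."""
    word: str
    part_of_speech: str
    position: Optional[ProperNamePosition] = None  # only set for proper names


SAMPLE_DICTIONARY = {
    e.word: e
    for e in (
        DictionaryEntry("Walter", "proper_name", ProperNamePosition.UNRESTRICTED),
        DictionaryEntry("Mr.", "proper_name", ProperNamePosition.GROUP_HEAD),
        DictionaryEntry("Station", "proper_name", ProperNamePosition.GROUP_END),
        DictionaryEntry("River", "proper_name", ProperNamePosition.GROUP_END_WITH_THE),
        DictionaryEntry("run", "verb"),  # not a proper name: no position information
    )
}
```

Using an integer enumeration keeps the stored values identical to the pattern numbers "0" to "3" used in the description.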

The dictionary retrieval section 5104 is a functional section that obtains information by searching the word dictionary 5018 on the basis of the retrieval key string input from section 5102, and transfers it to the dictionary information preservation table 5124, to the position information processing section 5110 and to section 5112, which judges the preceding sentence end. The processing according to patterns "0" to "3" of the position information for the proper names retrieved from the dictionary 5018 is performed by proper name processing sections 5114, 5116 and 5118. Proper names are processed for pattern "1" in section 5114, for patterns "2" and "3" in section 5116, and for pattern "0" in sections 5118 and 5114.

In this embodiment, proper names are grouped by using as a key, for example, a word that forms part of a group of proper names grouped into a single proper name, and by subjecting them to a positional restriction when they are grouped into one proper name. Even when several proper names occur in succession, they can be grouped appropriately in accordance with the context, without the erroneous grouping that results from simply always combining them into a single group of proper names. The processing for this purpose is performed in sections 5114, 5116 and 5118.

A certain range of proper names is registered in the reference dictionary 5018. Such proper names registered in the dictionary are subjected to morphological analysis in section 5110 and in sections 5114, 5116 and 5118. These form the processing section for proper names registered in the dictionary. Proper names that are not registered in the dictionary 5018 are morphologically analyzed in the preceding sentence end judging section 5112 and in section 5118, which processes a proper name corresponding to pattern "0". These form the processing section for proper names not registered in the dictionary.

A proper name is processed in the following two steps. First, a proper name is recognized in the input character string. For a word registered in the dictionary 5018, this is done by the proper name being indicated in its morpheme actuation information. For a word not registered in the dictionary 5018, it is done on the basis that its leading character is a German/English capital letter, for example "John" or "U. S.", etc.

Then a group of proper names is grouped so that the whole becomes a single proper name. If a unit is recognized as a proper name from the dictionary information and the next dictionary reference unit is also a proper name, the whole is combined into one proper name. For example, "M. Weber" is analyzed as a whole as one proper name. The result of the analysis forms a candidate for grouping the idiomatic expression including proper names in the local analysis.

Then the necessary local analysis is carried out. Here, a sequence of so-called parsing units that have been actuated by the morpheme actuation information for each of the parsing units is grouped into one parsing unit on the basis of a local parsing rule. For example, "Mr. Brown" is grouped into "Brown shi". Words that form part of a district name are grouped as well. For example, "Lake Biwa" is combined into "Biwako". In the same way, words that form part of a group name are grouped. For example, "Yale University" is analyzed as "Yale Daigaku".

In the case of the proper names "Mr. ..." and "Lake ...", there is always a break in the context immediately before them. Consequently, if "Tom Brown" is grouped into a single proper name, an error is caused in the subsequent analysis. Conversely, a proper name such as "Universität/University" is always immediately followed by a break. For example, in the English sentence "At Yale University Tom is ...", it is recognized that there is a break between "University" and "Tom". In this embodiment, the information on the position to which each proper name is restricted within a sequence of proper names is stored in the dictionary 5018 as the position information described above, i.e. as patterns "0" to "3". The grouping with this position information is performed in the processing sections 5110, 5112, 5114, 5116 and 5118. After these operations have been completed, the dictionary information for the input character string is stored in the buffer for retrieved dictionary information, i.e. in the dictionary information preservation table 5124.

The result of the morphological analysis is transferred from table 5124 to the section I 5020 for morphological analysis. The processing based on the proper name position information is performed according to the sequence shown in Fig. 40. Input processing is performed by taking the data for the input character string into the input processing section 5100 (5200). Section 5102 then divides the input character string into dictionary reference units for searching the dictionary 5018 (5201). Section 5104 searches the dictionary 5018 accordingly (5203) and, if there is a dictionary entry (5204), checks its part of speech (5205). If the part of speech is not a proper name, no proper name processing according to this embodiment is performed, and the dictionary information is stored in table 5124 (5206). If it is a proper name, the processing 5207 for proper names registered in the dictionary is performed in section 5110 and in sections 5114, 5116 and 5118. When this processing has reached the end position of the sentence indicated by the data of the input character string (5202), the result of the morphological analysis is delivered to the section I 5020 provided for it (5210).

If the dictionary lookup yields no entry at step 5204 and the unit begins with a capital letter (5212), it is recognized as a proper name not registered in the dictionary, and the processing 5213 for proper names not registered in the dictionary is performed in the preceding sentence end judging section 5112 and in the proper name processing section 5118. If the initial character is not a capital letter, the unit is a word not registered in the dictionary 5018 and is kept as an unregistered word in table 5124 (5214). Processing then continues with the end position check (5202).
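The branching at steps 5204, 5205 and 5212 can be sketched as a small classification function. This is a minimal sketch under stated assumptions: the function name `classify_unit`, the plain-dict dictionary layout and the returned labels are hypothetical, and only the decision order follows the description.

```python
from typing import Dict


def classify_unit(unit: str, dictionary: Dict[str, dict]) -> str:
    """Decide which processing branch a dictionary reference unit takes."""
    entry = dictionary.get(unit)            # dictionary retrieval (5203)
    if entry is not None:                   # entry found? (5204)
        if entry["part_of_speech"] == "proper_name":  # part-of-speech check (5205)
            return "registered_proper_name"           # -> processing 5207
        return "registered_word"                      # -> stored as is (5206)
    if unit[:1].isupper():                  # starts with a capital letter? (5212)
        return "unregistered_proper_name"             # -> processing 5213
    return "unregistered_word"                        # -> kept as unregistered (5214)
```

For instance, with a toy dictionary that contains only "Walter" as a proper name and "is" as a verb, "Walter" takes the 5207 branch, while a capitalized unknown word such as "Yokogawa" takes the 5213 branch.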

The processing 5207 for proper names registered in the dictionary is performed in the processing sections 5110, 5114, 5116 and 5118 according to the flowchart shown in Fig. 41. First, the position information contained in the dictionary information obtained is consulted (5220). If it indicates pattern "0", the proper name processing 5221 for pattern "0" is performed; if it indicates pattern "1", the proper name processing 5222 for pattern "1" is performed; and if it indicates pattern "2" or "3", the proper name processing 5223 for patterns "2" and "3" is performed. The processing 5221 for pattern "0" is performed in section 5114. It is applied to a proper name that has no positional restriction. First, if the part preceding the dictionary reference unit in question is an unregistered proper name (5230), the whole is grouped into one proper name with position information "1" and is stored in table 5124 (5233). If the preceding part is a proper name with position information "1" (5231), the processing is performed in the same way.

If the preceding part is a proper name with position information "0" (5231), the whole is grouped into one proper name with position information "0" and stored in table 5124 (5235). If, further, the preceding part is not a proper name with position information "0", the unit is stored on its own in table 5124 as a proper name with position information "0" (5234).
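The pattern "0" processing described above can be sketched roughly as follows. The `Item` record, the list standing in for table 5124 and the function name are hypothetical; an unregistered proper name is modeled here as a proper name whose position information is `None`, which is an assumption of this sketch rather than something the description states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Item:
    """Hypothetical record in the preservation table (stand-in for table 5124)."""
    text: str
    pos: Optional[int]   # position information; None if there is none
    is_proper: bool      # True for (registered or unregistered) proper names


def process_pattern0(table: List[Item], word: str) -> None:
    """Store `word` (a proper name without positional restriction) in `table`."""
    prev = table[-1] if table else None
    if prev is not None and prev.is_proper and prev.pos in (None, 1):
        # Preceding part is an unregistered proper name or has position "1":
        # group both into one proper name with position information "1".
        table[-1] = Item(f"{prev.text} {word}", 1, True)
    elif prev is not None and prev.is_proper and prev.pos == 0:
        # Preceding part has position "0": group with position information "0".
        table[-1] = Item(f"{prev.text} {word}", 0, True)
    else:
        # Otherwise the word is stored on its own with position information "0".
        table.append(Item(word, 0, True))
```

With a table that ends in "Mr." (position "1"), processing "Brown" yields the single item "Mr. Brown" with position "1", so that no further name can be attached in front of it.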

The proper name processing 5222 for pattern "1" is performed as described below. It is applied to a proper name such as "Mr." that is placed at the head of a single proper name, or at the head of a proper name formed as a group from a sequence of several proper names. First, if the part preceding the reference unit in question is an unregistered proper name (5240), the word is converted to an unregistered word (5241). If it is not an unregistered word, the unit is stored on its own in table 5124 as a proper name with the position information pos "1" (5242).

The proper name processing 5223 for patterns "2" and "3" is now described with reference to Fig. 44. It is applied, for example, to a proper name such as "Bahnhof" ("Station") or "Fluß" ("River") that is placed at the end of a single proper name, or of a proper name formed as a group from a sequence of several proper names. First, if the part preceding the reference unit in question is an unregistered proper name (5250), the unit is grouped together with the preceding word into one proper name whose proper name position information pos is the unit's own position information pos-self, and is stored in table 5124 (5255). The processing is performed in the same way if the preceding part is a proper name with position information "1" (5251).

If the preceding part is not a proper name with position information "0" (5252), the unit is stored on its own in table 5124 as a proper name with its own position information pos-self as the proper name position information (5257).

If, at step 5252, the preceding part is a proper name with position information "0", the unit's own position information pos-self is checked (5253), and the processing 5255 is performed if it is pattern "2". If the own position information pos-self is pattern "3", it is further checked whether the element before the reference unit is "der" ("the"). If it is not the definite article "the", the processing 5255 is performed. If it is "the", the group from "the" up to the unit itself is combined into one proper name with position information "3" and stored in table 5124 (5256).
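The branching for patterns "2" and "3" can be sketched as below. As before, `Item`, the list standing in for table 5124 and the function name are hypothetical, and an unregistered proper name is modeled as a proper name with position `None`. The sketch also assumes that the article check looks at the table entry two places back and compares case-insensitively against the English "the" only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Item:
    """Hypothetical record in the preservation table (stand-in for table 5124)."""
    text: str
    pos: Optional[int]
    is_proper: bool


def process_pattern23(table: List[Item], word: str, pos_self: int) -> None:
    """Store `word`, a group-final proper name with pattern 2 or 3, in `table`."""
    prev = table[-1] if table else None

    def merge_with_prev() -> None:
        # Processing 5255: group with the preceding word, keeping pos-self.
        table[-1] = Item(f"{prev.text} {word}", pos_self, True)

    if prev is not None and prev.is_proper and prev.pos in (None, 1):
        merge_with_prev()                           # checks 5250 / 5251
    elif prev is None or not (prev.is_proper and prev.pos == 0):
        table.append(Item(word, pos_self, True))    # stored alone (5257)
    elif pos_self == 2:
        merge_with_prev()                           # pattern 2 after a "0" name
    else:
        # Pattern 3: look for the definite article in front of the group.
        before = table[-2] if len(table) >= 2 else None
        if before is not None and before.text.lower() == "the":
            merged = f"{before.text} {prev.text} {word}"
            table[-2:] = [Item(merged, 3, True)]    # grouped with "the" (5256)
        else:
            merge_with_prev()
```

For the example from the description, a table ending in "the" and "Sumida" (position "0") followed by "River" (pattern "3") collapses into the single proper name "the Sumida River" with position "3".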

For a word that begins with a capital letter and is recognized as an unregistered word, i.e. one for which the retrieval operation 5203 finds no entry in the dictionary 5018, the flowchart proceeds via steps 5204 and 5212 to the processing 5213, which is performed in the preceding sentence end judging section 5112. First, if the part preceding the reference unit in question is not a candidate for the sentence end, the processing 5221 for pattern "0" is performed in the processing section 5118 as described above.

The preceding part can be a sentence-end candidate in the following four cases. The first is the case where a separate period "." is present. The second is the case where the preceding entry ends with a period and the position information for the proper name is not "21"; this case includes, for example, an abbreviation such as "U. S. A.". The third is the case of a colon ":", a semicolon ";", a period followed by an apostrophe (.'), or a period followed by a quotation mark ("). The last is the case where the word is at the head of the buffer for the input character string.

If the preceding part falls under one of the four cases described above, the preceding sentence-end candidate is recognized as the sentence end (5261), and the dictionary is consulted after the capital letter of the word has been converted to lower case (5262). If a dictionary entry is obtained as a result (5263), it is recorded in table 5124 (5264). If not, the word is recorded in table 5124 as an unregistered proper name, with its initial character left unchanged as a capital letter (5265).
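The four sentence-end cases and the lower-case retry of steps 5260 to 5265 can be sketched roughly as follows. This is a minimal illustration under assumptions, not the patented implementation: the function names, the dictionary shape (word mapped to part of speech) and the use of `None` for "head of the input buffer" are hypothetical, and the value "21" is taken literally from the text.

```python
# Hypothetical sketch of the sentence-end test and the lower-case retry.
SENTENCE_END_MARKS = (":", ";", ".'", '."')  # third case in the text


def is_sentence_end_candidate(prev_entry, prev_pos_info=None):
    """Return True if the preceding entry may close a sentence.

    prev_entry is None when the word is at the head of the input buffer.
    """
    if prev_entry is None:                      # fourth case: head of buffer
        return True
    if prev_entry == ".":                       # first case: separate period
        return True
    if prev_entry.endswith(SENTENCE_END_MARKS):  # third case
        return True
    # Second case: entry ends with a period and its proper-name position
    # information is not "21" (value as stated in the text).
    return prev_entry.endswith(".") and prev_pos_info != "21"


def lookup_capitalized(word, dictionary, prev_entry, prev_pos_info=None):
    """Look up a capitalized word, retrying in lower case after a sentence end."""
    if word in dictionary:
        return word, dictionary[word]
    if is_sentence_end_candidate(prev_entry, prev_pos_info):
        lowered = word[0].lower() + word[1:]    # step 5262
        if lowered in dictionary:               # step 5263
            return lowered, dictionary[lowered]
    # Step 5265 (or pattern-0 processing): keep the capital letter.
    return word, "unregistered proper name"
```

With a dictionary containing only "along", the word "Along" at the head of the buffer resolves to the lower-case entry, whereas "Sumida" after "the" stays an unregistered proper name.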

A further example is now explained. Suppose the dictionary is consulted for the input character string "Along the Sumida River Paul and Mr. Gold Smith went . . .". The dictionary entry information is first written into table 5124, as shown in Fig. 64A. For "the", for example, the start position in the sentence is "7", the end position is "9", and the part of speech is definite article. Dictionary retrieval 5203 obtains no entry for the word "Along" at the head of the input character string, and the word is found to be unregistered. However, because the word is at the head of the input buffer, the preceding part satisfies the sentence-end candidate condition (5260); the initial capital "A" is therefore converted to lower case, and dictionary retrieval 5262 is performed for "along".
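The start and end positions quoted for "the" are consistent with 1-based, inclusive character offsets into the input string. The following short sketch shows that reading; the function name is hypothetical and splitting on whitespace only is a simplifying assumption.

```python
import re


def reference_units(text):
    """Return (unit, start, end) triples with 1-based inclusive positions."""
    return [(m.group(), m.start() + 1, m.end())
            for m in re.finditer(r"\S+", text)]


units = reference_units("Along the Sumida River Paul and Mr. Gold Smith went")
# units[1] is the entry for "the": start position 7, end position 9.
```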

The pointer is then incremented to proceed to the processing for "Sumida". In the present embodiment this word is not registered in dictionary 5018. Since the preceding part is not a sentence-end candidate, the flow passes to proper-name processing 5221 for pattern 0. As shown in Fig. 46A, "proper name" is obtained as the part of speech and "0" as the proper-name position information.

The next reference unit "River" is a proper name with position information "3". The preceding part is a proper name with position information "0", and the part before that is "the". Accordingly, the expression "the Sumida River" is collectively assembled into a single proper name through steps 5250 to 5254 and stored in table 5124 with position information "3" (Fig. 46B). The next reference unit "Paul" is an unregistered proper name with position information "0", to which processing 5213 is applied. Although the preceding word is a proper name, its position information is "3", so nothing is grouped; the dictionary information is stored as it is in table 5124 (Fig. 46C). Ordinary processing is applied to the following conjunction "and".

The next word "Mr." is a proper name with position information "1", which is stored as it is in table 5124 (Fig. 46D). Even though the proper name "Paul" precedes it, there is a separation between words immediately before "Mr.", so "Mr." can be stored as it is in table 5124.

Further, the word "Gold" is a proper name that is not registered in the dictionary, and processing 5213 is applied to it. Since the preceding word "Mr." has position information "1", the two are combined, and the whole is formed into one proper name with position information "1" (Fig. 46E). The processing for the next word "Smith" is similar (Fig. 46F). The following word "went" is the past tense of a verb, and ordinary parsing is carried out from there on.

As described above with reference to the embodiment, proper names are grouped by using as a key a word that forms part of a group of proper names to be combined into a single proper name and that is subject to a positional restriction when so combined. In this way, even when several proper names occur in succession, they can be grouped correctly in accordance with the context, avoiding the erroneous grouping that would result from simply copying them all into one group of proper names. In the above example, the words "the Sumida River" are analyzed as one group of proper names, separate from the following word "Paul". "Mr. Gold Smith" is likewise analyzed as one group of proper names.
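The grouping behaviour of the worked example can be reproduced with a small sketch. It is a simplified illustration under assumptions, not the full flow of steps 5250 to 5256: the `Entry` type and function name are hypothetical, only position information "0", "1" and "3" is handled, and merging two consecutive "0" names, as well as merging without a preceding "the", are assumptions where this passage does not spell the rule out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    text: str
    pos_info: Optional[int]  # None means "not a proper name"


def group_proper_names(entries):
    """Group adjacent proper names using their position information."""
    table = []
    for e in entries:
        prev = table[-1] if table else None
        if e.pos_info is None or prev is None or prev.pos_info is None:
            table.append(Entry(e.text, e.pos_info))
        elif e.pos_info == 0 and prev.pos_info in (0, 1):
            # A free name joins a preceding free name or a leading word
            # such as "Mr."; the group keeps the earlier position info.
            table[-1] = Entry(f"{prev.text} {e.text}", prev.pos_info)
        elif e.pos_info == 3 and prev.pos_info == 0:
            # A trailing word such as "River" closes the group and also
            # absorbs a definite article standing in front of it.
            table.pop()
            text = f"{prev.text} {e.text}"
            if table and table[-1].text.lower() == "the":
                text = f"{table.pop().text} {text}"
            table.append(Entry(text, 3))
        else:
            # For example "Paul" after a group that ended with pos_info 3.
            table.append(Entry(e.text, e.pos_info))
    return table


sentence = [
    Entry("along", None), Entry("the", None), Entry("Sumida", 0),
    Entry("River", 3), Entry("Paul", 0), Entry("and", None),
    Entry("Mr.", 1), Entry("Gold", 0), Entry("Smith", 0), Entry("went", None),
]
grouped = [e.text for e in group_proper_names(sentence)]
```

Because "River" carries position information "3", the group is closed before "Paul", which is the separation the passage describes.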

A sixth embodiment of the speech analyzer having features of the invention, applied to an automatic English/German-to-Japanese translation device, is now described with reference to Fig. 47. This embodiment has an input processing section 6014, to which data are input from an input unit 6012. The input unit 6012 comprises, for example, a keyboard with character keys such as alphanumeric keys and function keys, an optical character reader for reading English/German text recorded on paper, and a reader for a magnetic disk.

The input processing section 6014 has a buffer 6014a for an input character string (word string) and stores the English/German sentence input from unit 6012 in buffer 6014a. Section 6014 reads out the sentence stored in buffer 6014a and delivers it to a unit dividing section 6016. Section 6016 is a functional section that divides the input sentence from section 6014 into dictionary reference units by consulting a delimiter table 6018. Table 6018 contains delimiters such as spaces and commas.

Section 6016 reads the delimiters from table 6018 and divides the sentence input from section 6014 into character strings that serve as the units for consulting a reference dictionary 6020, splitting the sentence at the places where delimiters occur. The divided character strings are input to a dictionary retrieval section 6022.
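The division into reference units can be sketched as a single pass over the sentence with a delimiter table. The table contents and the function name below are hypothetical, and keeping non-space delimiters as units of their own is an assumption made for illustration.

```python
# Hypothetical contents of a delimiter table such as table 6018.
DELIMITER_TABLE = {" ", ",", ":"}


def split_into_units(sentence, delimiters=DELIMITER_TABLE):
    """Split a sentence into dictionary reference units at delimiters."""
    units, current = [], []
    for ch in sentence:
        if ch in delimiters:
            if current:                 # close the unit collected so far
                units.append("".join(current))
                current = []
            if not ch.isspace():        # assumption: keep punctuation as a unit
                units.append(ch)
        else:
            current.append(ch)
    if current:
        units.append("".join(current))
    return units
```

Since the period is not in this table, an abbreviation such as "Mr." stays in one unit.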

Section 6022 consults the reference dictionary 6020 for the input sentence delivered from section 6016, which has been divided into reference units. As shown in Fig. 48, the reference dictionary 6020 contains, for example, entries for the character strings of the English/German sentence together with their part of speech, feature information, and so on. In addition to the proper names shown in the figure, the reference dictionary 6020 contains character strings for other parts of speech, for example verbs and adjectives. The part of speech "proper name" in this figure means that the entries are used in connection with the registered proper-name processing described later; it does not denote a proper name in the usual grammatical sense. Further, the feature information indicates what the proper name in question expresses and is not limited to a single item.

Section 6022 consults the reference dictionary 6020 for the character string divided off as a reference unit and, if the string is a proper name, delivers it to a section 6024 for the proper-name processing described later. If it is not a proper name, it is delivered to a processing section 6036 and stored in table 6036a of that section. Section 6024 comprises a preceding-sentence-end processing section 6026, a preceding-proper-name processing section 6028, and a section 6030 that processes the proper name itself.

Section 6026 judges whether the character string preceding the input string received from section 6022 is at the end of a sentence. If the preceding string is at the sentence end, the capital letter at the beginning of the string being processed is converted to lower case, the string is delivered to section 6022, and the retrieval section is made to consult reference dictionary 6020 again. A string not found in this second retrieval is classified as an unregistered proper name, delivered to processing section 6036, and kept in its table 6036a. If the string preceding the input string is not at the sentence end, the input string is delivered to processing section 6036 as a proper name whose feature information is unknown and is registered in table 6036a, as described later.

Section 6028 analyzes the feature information for the preceding character string delivered from section 6026 and passes the result to section 6030. Section 6030, which processes the proper name itself, checks the feature information of the proper name to be analyzed and, if feature information is not registered for one of the two, either the proper name or the preceding proper name, analyzes the proper name and the preceding proper name using the registered feature information of the other, and stores the result in table 6036a of processing section 6036.

Processing section 6036 has the dictionary information holding table 6036a, stores the data delivered from section 6028 or 6032 in table 6036a, and then reads out the data so stored and delivers them to a syntactic analysis section 6038, which reads them from table 6036a, subjects them to morphological analysis, and performs the structural analysis of the input sentence.

The operation of the device is explained with reference to the flowchart shown in Fig. 49. First, a German/English input sentence is read from input unit 6012 into input processing section 6014 (6100). The input sentence read into section 6014 is stored in buffer 6014a, and the input sentence stored in buffer 6014a is read out to unit dividing section 6016.

When the input sentence has been read in, section 6016 reads the delimiters from table 6018 in order to divide the sentence into dictionary reference units (6102). That is, the character strings making up the input sentence are divided one after another, starting from the beginning, into retrieval key strings, the units for consulting reference dictionary 6020, by splitting at the places where delimiters such as spaces and colons occur. It is judged whether the divided reference units, i.e. the retrieval key strings, have been exhausted (6104); if a retrieval string remains, it is delivered to dictionary retrieval section 6022.

When the retrieval string has been delivered to section 6022, that section consults reference dictionary 6020 for the retrieval string (6106). It is then judged whether the retrieval string is present among the entries of reference dictionary 6020 as shown in Fig. 48 (6108); if there is an entry, the part-of-speech information stored in reference dictionary 6020 is read out and it is judged whether the retrieval string is a proper name (6110).

If the retrieval string is not a proper name, section 6022 delivers the data read from reference dictionary 6020 to processing section 6036 and stores them in table 6036a (6112). When the data have been stored in table 6036a, an indication that the data have been stored, together with the data for the retrieval string stored immediately before, is input from processing section 6036 to section 6016. The flow then returns to step 6102, and division into dictionary reference units is performed in section 6016.

If the retrieval string is a proper name at step 6110, section 6022 delivers the proper name read from reference dictionary 6020, together with the data for the previous retrieval string that were input from table 6036a by way of section 6016 and dictionary retrieval section 6022, to section 6024, where the processing for a proper name registered in the dictionary is performed (6114).

The processing for a proper name registered in the dictionary is now described with reference to the flowchart shown in Fig. 50. The data delivered from section 6022 to section 6024 are passed through the preceding-sentence-end processing section 6026 to the preceding-proper-name processing section 6028. Section 6026 has no function in the processing of a proper name registered in the dictionary.

Section 6028 then judges whether the retrieval string preceding the proper name is a proper name not registered in reference dictionary 6020, i.e. a proper name that has been subjected to the processing for proper names not registered in the dictionary, which is described later (6200). If it is an unregistered proper name, the proper name together with the preceding unregistered proper name is judged as a whole to be one proper name having the feature information of the proper name (6202); the data are then delivered to processing section 6036 and stored in its table 6036a (6214).

If processing section 6028 judges at step 6200 that the retrieval string preceding the proper name is not an unregistered proper name, it is judged whether that string is a proper name registered in reference dictionary 6020 (6204). If the retrieval string preceding the proper name is a registered proper name, it is judged whether the feature information of the preceding proper name is unknown, i.e. whether or not it is registered in reference dictionary 6020 (6206).

If the feature information of the preceding proper name is unknown, the flow advances to step 6202, where the proper name and the preceding proper name are treated as a whole as a single proper name having the feature information of the proper name (6202); processing section 6028 then delivers the data to processing section 6036, where they are stored in its table 6036a (6214).

If processing section 6028 judges that the feature information of the preceding proper name is not unknown, that is, that it is registered in reference dictionary 6020, the data are delivered from processing section 6028 to section 6030. Section 6030 then judges whether the feature information of the proper name is known (6208). If the feature information of the proper name is unknown, section 6030 judges the proper name and the preceding proper name as a whole to be one proper name having the feature information of the preceding proper name (6210) and delivers the data to section 6036, in whose table 6036a they are recorded (6214).

If section 6030 finds that the feature information of the proper name is not unknown, that is, that it is registered in reference dictionary 6020, it judges the proper name, independently of the preceding proper name, to be a proper name with the feature information retrieved from reference dictionary 6020 (6212) and delivers the data to processing section 6036, in whose table 6036a the data are then recorded (6214). Returning to Fig. 49, if no entry for the retrieval string is present in reference dictionary 6020 at step 6108, it is judged whether the first character of the string is a capital letter (6116); if it is not, section 6022 judges the retrieval string to be an unregistered word, delivers it to section 6036, and stores it in its table 6036a (6118).
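The decision chain of Fig. 50 (steps 6200 to 6214) for a proper name registered in the dictionary can be summarized in a short sketch. The `Name` type, the function name and the feature values used in the examples are hypothetical, since the contents of Fig. 48 are not reproduced here; the branch for a preceding unit that is not a proper name at all is an assumption.

```python
from typing import NamedTuple, Optional


class Name(NamedTuple):
    text: str
    registered: bool          # found in the reference dictionary
    feature: Optional[str]    # None means "feature information unknown"


def process_registered_name(current, previous):
    """Return (text, feature) to be stored for a registered proper name.

    previous is None when the preceding unit is not a proper name.
    """
    if previous is None:
        # Assumption: with no preceding proper name, store the name as it is.
        return current.text, current.feature
    merged = f"{previous.text} {current.text}"
    if not previous.registered or previous.feature is None:  # 6200 / 6206
        return merged, current.feature                       # 6202
    if current.feature is None:                              # 6208
        return merged, previous.feature                      # 6210
    return current.text, current.feature                     # 6212
```

In this reading, two names are merged whenever exactly one of them can supply the feature information, and kept apart when both carry their own.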

If the first character is a capital letter, the data for the retrieval string is passed, together with the data for the previous retrieval string, from the section 6022 to the section 6024, where the processing for an unregistered proper name is carried out (6120).

The processing for a proper name not registered in the dictionary is now described with reference to Fig. 51. The data for the retrieval string is passed, together with the data for the previous entry recorded in the table, to the previous-sentence-end processing section 6026, which judges whether or not the end of the previous entry recorded in the table is a candidate for the end of a sentence (6300). This judgement checks whether the previous entry ends in a sentence-end candidate, for example a separate period (.).

If the end of the previous entry recorded in the table is a sentence-end candidate, the data is passed from the section 6026 to the previous-proper-name processing section 6028. The section 6028 then judges the previous entry recorded in the table to be the end of a sentence (6302), changes the capital letter at the beginning of the retrieval string to a lower-case letter, and passes the string to the dictionary retrieval section 6022.

The section 6022 searches the reference dictionary 6020 for the retrieval string rewritten with the lower-case letter (6304) and judges whether it has an entry in the reference dictionary 6020 (6306). If there is an entry, the section 6022 passes the data retrieved from the reference dictionary 6020 to the processing section 6036 and stores it in the table 6036a (6308). If there is no entry, the section 6022 restores the first character of the retrieval string to the capital letter, passes the string to the processing section 6036 as an unregistered proper name and stores it in the table 6036a (6310). If at step 6300 the processing section 6026 judges that the end of the previous entry recorded in the table is not a sentence-end candidate, the data is passed from the section 6026 to the processing section 6028, which then judges that the previous entry recorded in the table is not the end of a sentence (6312). The data is then passed from the section 6028 to the section 6030, which judges the retrieval string to be a proper name whose feature information is unknown (6314).

The section 6030 then returns the data to the section 6028, where the processing for a dictionary-registered proper name is carried out (6316). This processing is the same as that shown in Fig. 50.
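The decision flow of Fig. 51 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function name `process_unregistered`, the dictionary represented as a plain `dict`, the tuple layout of the result and the set `SENTENCE_END_MARKS` are all hypothetical. The start of the text is treated like a sentence end, as in the worked example of Fig. 52.

```python
# Assumption: only the separate period is named in the text as a sentence-end candidate.
SENTENCE_END_MARKS = {"."}


def process_unregistered(word, previous, dictionary):
    """Handle a capitalized word that has no dictionary entry (cf. Fig. 51).

    word       -- retrieval string starting with a capital letter
    previous   -- surface form of the previous table entry, or None at the top of the text
    dictionary -- hypothetical mapping from word to its dictionary data
    Returns a tuple (surface, kind, data).
    """
    # Step 6300: is the previous entry a sentence-end candidate?
    at_sentence_start = previous is None or previous in SENTENCE_END_MARKS
    if at_sentence_start:
        # Steps 6302/6304: lower the first letter and search the dictionary again.
        lowered = word[0].lower() + word[1:]
        if lowered in dictionary:
            # Step 6308: an ordinary word that was capitalized only by position.
            return (lowered, "word", dictionary[lowered])
        # Step 6310: restore the capital letter, record as unregistered proper name.
        return (word, "proper", None)
    # Steps 6312/6314: mid-sentence capital, so a proper name with unknown feature.
    return (word, "proper", None)
```

In the flow described in the text, the mid-sentence result is then handed on to the registered-proper-name processing of Fig. 50, which this sketch leaves out.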

Returning to Fig. 49, when the dictionary reference units are exhausted at step 6104, the data recorded in the table 6036a is passed from the section 6036 to the section 6038 (6122), which completes the morphological analysis of this embodiment. The operation of the embodiment described above is now explained using an input sentence.

The explanation refers to Fig. 52 and takes as an example the input sentence "In Tokyo Station Mr. Walter . . .". First, input processing 1100 reads the input sentence into the processing section 6014. The sentence is then divided into dictionary reference units (6102) by splitting it at spaces into individual words. First, the reference dictionary 6020 is searched for "In" (6106). There is no entry for "In" in the reference dictionary 6020. Processing therefore advances to the processing for a proper name not registered in the dictionary; because the preceding part is recognized as the beginning of a sentence (the top of the file), "In" is converted to "in". Since "in" has an entry in the reference dictionary 6020 and is not a proper name (6110), the data retrieved from the reference dictionary 6020 is recorded in the table 6036a (6112).

Then the reference dictionary 6020 is searched for "Tokyo" (6106). Since there is no entry for "Tokyo" in the reference dictionary 6020 (6108) and the first character is a capital letter (6116), the processing for a proper name not registered in the dictionary is carried out (6120), and control advances to Fig. 51. Since the preceding part is "in", which is not a sentence-end candidate (6300), it is recognized as not being the end of a sentence (6312); "Tokyo" is recognized as a proper name whose feature information is unknown (6314), and the processing for a dictionary-registered proper name is carried out (6316). Control then advances to Fig. 50. Since the preceding "in" is neither a registered nor an unregistered proper name (6204), "Tokyo" alone is recorded as a proper name with its own feature information, that is, as a proper name whose feature information is unknown (6216). Control then returns to Fig. 49, and the reference dictionary 6020 is searched for "Station" (6106). Since there is an entry for "Station" in the reference dictionary 6020 (6108) and it is a proper name (6110), the processing for a dictionary-registered proper name is carried out (6114), and control advances to Fig. 50. Since the preceding "Tokyo" is an unregistered proper name (6200), the whole of "Tokyo Station" is recorded as a single proper name having the feature information "place" taken from "Station" (6206).

Then, in Fig. 49, the reference dictionary 6020 is searched for "Mr." (6106). Since there is an entry for "Mr." in the reference dictionary 6020 and it is a proper name (6110), the processing for a dictionary-registered proper name is carried out (6114), and control advances to Fig. 50. The preceding expression "Station" is not an unregistered proper name (6200) but a registered proper name (6204), and its feature information "place" is not unknown (6206). Since the feature information of "Mr." is likewise not unknown (6208), "Mr." alone is recorded as a proper name with the feature information "person" (6212).

Returning again to Fig. 49, the reference dictionary 6020 is searched for "Walter" (6106). Since "Walter" has an entry in the reference dictionary 6020 (6108) and is a proper name (6110), the processing for a dictionary-registered proper name is carried out (6114), and control advances to Fig. 50. The preceding "Mr." is not an unregistered proper name (6200) but a registered proper name (6204) whose feature information "person" is not unknown (6202), whereas the feature information of "Walter" is unknown (6208). "Mr. Walter" is therefore combined and recorded as a single proper name with the feature information "person" (6210).
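The walkthrough above can be reproduced with a small end-to-end sketch in Python. It is a simplified illustration, not the implementation of the embodiment: the toy `DICTIONARY`, the function name `analyze`, the three-element table entries and the treatment of the case where both neighbouring proper names lack feature information (they are simply merged with an unknown feature) are assumptions made for this example.

```python
# Hypothetical toy dictionary: word -> (is_proper_name, feature or None).
DICTIONARY = {
    "in": (False, None),
    "Station": (True, "place"),
    "Mr.": (True, "person"),
    "Walter": (True, None),  # registered, but feature information unknown
}
SENTENCE_END = {"."}  # assumption: a separate period is the only sentence-end candidate


def analyze(sentence):
    """Return the table of entries [surface, is_proper_name, feature]."""
    table = []
    for word in sentence.split():  # split into dictionary reference units at spaces
        prev = table[-1] if table else None
        if word in DICTIONARY:
            is_proper, feature = DICTIONARY[word]
        elif not word[0].isupper():
            table.append([word, False, None])  # unregistered ordinary word
            continue
        else:
            # Capitalized and not in the dictionary (cf. Fig. 51).
            at_start = prev is None or prev[0] in SENTENCE_END
            lowered = word[0].lower() + word[1:]
            if at_start and lowered in DICTIONARY:
                word = lowered
                is_proper, feature = DICTIONARY[lowered]
            else:
                is_proper, feature = True, None  # proper name, feature unknown
        # Registered-proper-name processing (cf. Fig. 50): merge with the preceding
        # proper name when either of the two lacks feature information.
        if is_proper and prev is not None and prev[1] and (prev[2] is None or feature is None):
            prev[0] = prev[0] + " " + word
            prev[2] = prev[2] or feature
        else:
            table.append([word, is_proper, feature])
    return table


print(analyze("In Tokyo Station Mr. Walter"))
```

For the example sentence this yields three entries: "in" as an ordinary word, "Tokyo Station" with the feature "place", and "Mr. Walter" with the feature "person", matching the result described in the text.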

As described above, in this embodiment the German/English input sentence is divided into retrieval strings, each of which is first looked up in the reference dictionary 6020. If the reference dictionary 6020 contains an entry in the form of a proper name, the processing for a registered proper name is carried out, taking into account the previous entry recorded in the table.

If the previous entry recorded in the table is a proper name, the feature information of that previous entry and of the proper name currently being processed is checked. If one of the two lacks feature information, it is given the feature information of the other; if both have feature information, they are recognized individually as proper names, each with its own feature information.

It is therefore possible to give appropriate feature information to a proper name that has none, and to restrict the feature information so provided in a suitable way. This enables a more effective analysis in the subsequent structural analysis, and a correspondingly better translation.

Furthermore, for a character string that is not registered in the reference dictionary 6020, if the first character is a capital letter and the preceding character string is judged to be the end of a sentence, the reference dictionary 6020 is searched again after the capital letter has been changed to a lower-case letter. It is therefore possible to look up the reference dictionary 6020 for the character string at the beginning of a sentence as well. If a character string beginning with a capital letter appears anywhere other than at the beginning of a sentence, it is judged to be a proper name, and its feature information is supplied by a proper name with registered feature information that stands before or after it. Consequently, a proper name that is not registered in the reference dictionary 6020 can still be parsed to a certain extent.

A seventh embodiment is now described. Fig. 54 shows the overall structure of the seventh embodiment of a speech analyzer having features of the invention, used in an automatic English/Japanese translation device. This embodiment has an input section 7010 through which an English text 7012 to be translated into Japanese is input. The input section 7010 may comprise, for example, a keyboard with character keys such as alphanumeric or function keys, an optical character reader (OCR) that reads English text recorded on paper, and/or a file storage device for reading English text recorded on a storage medium such as a magnetic disk.

The English text input from the input section 7010 is read into a pre-editing section 7014, in which pre-processing for translation is carried out. This mainly comprises sentence recognition and the handling of unknown words.

This pre-processing acts as part of the morphological analysis. The pre-edited English data is transferred, together with the information obtained during pre-editing, to a morphological analysis section 7016. The section 7016 breaks the English sentence down into morphemes by looking them up in a word dictionary 7018, performs various kinds of grouping, such as the processing of unknown words and of expressions for proper names, times, numbers and so on, and performs processing over the whole sentence, such as the recognition of supplementary questions and appositions. The morphological analysis rules for this are held in a rule file 7036.

The English data that has undergone morphological analysis is transferred, together with the dictionary information obtained by the morphological analysis, to a syntactic analysis section I 7020. The section I 7020 is a functional section that parses the surface structure of the sentence by applying grammar rules to the English data, and finds all structural possibilities. The English data analyzed in the section I 7020 is then passed, together with the analysis information, to a syntactic analysis section II 7020, in which one solution is selected from the results of the surface-structure analysis of section I by applying a structural description. A plausible parse tree of the English sentence is thus prepared and the structure is determined. The analysis rules are stored in the rule file 7036.

After the syntactic and morphological analysis, the English data is transferred as parse tree data to a structure conversion section 7024. The section 7024 prepares a corresponding Japanese structure tree from the parse tree, which is an English intermediate structure, and converts it into an underlying Japanese structure from which a Japanese sentence can easily be generated.

The structure tree data representing the converted underlying Japanese structure is passed to a translation section 7062, where a translated sentence is generated. This is a functional section for generating a Japanese sentence from the Japanese structure tree. The Japanese data formed as a translated sentence, that is, the translated sentence data, is passed to a post-editing section 7030. The section 7030 modifies the translated sentence data with reference to the dictionary 7018, using information that was used during translation, in order to produce a more natural Japanese sentence. The data for the Japanese sentence is transferred to an output section 7032 and output from there as a translated Japanese sentence 7034. The output section 7032 may comprise, for example, a printer, a display and/or a file storage device such as a magnetic disk. The flow of this series of translation processes is controlled by a control section 7038, which controls the device as a whole.

In this embodiment, the word dictionary 7018 stores dictionary data for English and Japanese words. It defines a vocabulary together with connective relations, that is, co-occurrence relations, and various other information such as singular and plural forms, meanings and parts of speech. Rule data for the morphological and syntactic analysis is stored in the rule file 7036. The control section 7038 is connected to an operation and display section 7040, which has operating keys with which an operator issues commands to the device, for example a translation command key and cursor keys, and a display that visually presents to the operator the input Japanese sentence text, the Japanese sentence resulting from translation, intermediate data such as dictionary information, and various other information.

Most of the operation and display functions may be provided by a keyboard, if one is arranged at the input section 7010, or by a display, if one is arranged at the output section 7032.

Fig. 53 shows details of the morphological analysis section 7016 with regard to the processing of numbers. Only the parts that contribute directly to an understanding of the invention are shown, although the morphological analysis section 7016 naturally also has other functional sections for morphological analysis. The section 7016 has an input processing section 7100 for receiving and processing the input character string data supplied by the pre-editing section 7014. The section 7100 is provided with a buffer into which the data for the English character string is entered in the form of code data such as ASCII, and temporarily stores the character string data.

The input character string data temporarily stored in the processing section 7100 is passed to a section 7102, which divides the data into dictionary reference units such as words. The section 7102 is a functional section that identifies the dictionary reference units which, one after another, form the character strings used when the dictionary 7018 is searched in the section 7106. The dictionary reference delimiters used in this dividing process are placed at characters other than an English letter, a numeral, an apostrophe, a hyphen or a period, and at an apostrophe that follows a space. They are stored in a delimiter table 7104, which is consulted when the dictionary reference units are divided in the section 7102.
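The division into dictionary reference units might be sketched as follows in Python. This rests on one possible reading of the delimiter rule, which is garbled in the source: a unit is taken to be a run of English letters, digits, apostrophes, hyphens and periods, every other character acts as a delimiter, and a unit may not begin with an apostrophe (so an apostrophe following a space also delimits). The names `UNIT_PATTERN` and `split_units` are hypothetical, and the delimiter table 7104 is replaced here by a regular expression.

```python
import re

# Assumed reading of the delimiter rule: a unit starts with a letter or digit and
# continues over letters, digits, apostrophes, hyphens and periods; anything else
# (space, comma, ...) separates units.
UNIT_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9'.\-]*")


def split_units(text):
    """Return the dictionary reference units found in an ASCII English string."""
    return UNIT_PATTERN.findall(text)


print(split_units("Dec. 25, 1986 at 3 o'clock"))
```

Under this reading, abbreviations such as "Dec." and forms such as "o'clock" stay together as single units, while the comma after "25" only separates units.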

The dictionary 7018 contains, in particular, information for looking up the divided units. In addition, the dictionary 7018 stores morpheme processing information for entries such as the names of months, the days of the week, cardinal numbers representing only a numerical value, ordinal numbers, units for expressing grams and the like, time, "the", "of", the comma (,), the period (.) and so on.

The dictionary retrieval section 7106 is a functional section that searches the dictionary 7018 to extract the dictionary information for the character string supplied by the section 7102, and passes it to a section 7108 that provides morpheme processing information. For a character string recognized in the dictionary retrieval section 7106 as a cardinal number or as having a temporal meaning, the section 7108 attaches more specific morpheme processing information (see Fig. 56), indicating for example that the character string has a temporal meaning such as hour, year or month. For a numeral, for example, information is attached meaning that the feature "number" is "yes". The character string provided with this information in the section 7108 is then subjected, where necessary, to local grammatical parsing.

In this case, a group of dictionary reference units, such as words, that have been provided with morpheme processing information is combined into a single unit by means of the local parsing rules. For example, "month name" and "numeric expression" are combined into "month name + numeric expression", so that "Oct." and "18" are grouped into "Oct. 18". Groupings are likewise made for "November the 2nd" (month name + the + numeric expression), "22 March" (numeric expression + month name), "the 23rd May" (the + numeric expression + month name), "the 11th of June" (the + numeric expression + of + month name), "′86, Jan. 27. Mon." (year + , + month and day + , + day of the week), "Sunday 26, Jan., 1986" (day of the week + , + month and day + , + year), "11:30 a.m." (number : number + a.m. or p.m.), as well as "month name + year", "month name + of + year" and so on.
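The grouping rules listed above can be pictured as a small pattern table over unit categories. The following Python fragment is only an illustrative sketch, not the implementation described in the patent: the category labels (`MONTH`, `NUM`, `THE`, `OF`), the function names and the regular expression are assumptions introduced here, and only five of the listed patterns are covered.

```python
import re

# Hypothetical category tables; the patent keeps such data in the dictionary 7018.
MONTH_NAMES = {
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
}
MONTH_ABBREVS = {name[:3] for name in MONTH_NAMES}

# Cardinal ("18") or ordinal ("2nd", "23rd", "11th") numeric expression.
NUMERIC = re.compile(r"^\d+(st|nd|rd|th)?$")


def classify(token):
    """Assign an assumed category label to one dictionary reference unit."""
    word = token.lower().rstrip(".,")
    if word in MONTH_NAMES or word in MONTH_ABBREVS:
        return "MONTH"
    if NUMERIC.match(word):
        return "NUM"
    if word in ("the", "of"):
        return word.upper()
    return "OTHER"


# A subset of the local parsing patterns named in the text.
PATTERNS = [
    ("MONTH", "NUM"),               # "Oct. 18"
    ("MONTH", "THE", "NUM"),        # "November the 2nd"
    ("NUM", "MONTH"),               # "22 March"
    ("THE", "NUM", "MONTH"),        # "the 23rd May"
    ("THE", "NUM", "OF", "MONTH"),  # "the 11th of June"
]


def matches_date_pattern(tokens):
    """Return True if the category sequence of the tokens is a listed pattern."""
    return tuple(classify(t) for t in tokens) in PATTERNS
```

For instance, `matches_date_pattern("the 11th of June".split())` is true under these assumptions, whereas a sequence such as `["he", "said"]` is not grouped.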

The local analysis is carried out by an initial value setting section 7110, a matching retrieval section 7112, a unit dividing section 7114, a section 7118 that provides morpheme processing information, query sections 7116 and 7120, processing sections 7122 and 7124, and a matching table 7128. The matching table 7128 contains a morpheme processing table, which is a reference table for determining, according to fixed rules, that a sequence of units representing numbers and time factors, as shown in Fig. 57, forms a composite unit of time factors. The section 7110 sets the initial value of a counter n, which counts the number of dictionary reference units when the matching retrieval section 7112 matches a sequence of dictionary reference units as a unit group of the kind described above.

The section 7112 looks up the matching table 7128 for each dictionary reference unit in order to perform matching. With "p" denoting the dictionary reference unit for which dictionary retrieval in the section 7106 has been completed, the unit dividing section 7114 separates from the character string the dictionary reference unit that lies n units after the unit p, n being the value of the counter.

The query section 7116 is a functional section with a function similar to that of the dictionary retrieval section 7106: it searches the dictionary 7018 to extract the dictionary information for the character string separated in the section 7114 and passes it to the section 7118 that provides morpheme processing information. The section 7118 has the same function as the section 7108 and attaches more specific information to each item recognized in the query section 7116 as an ordinal number or a time factor.

The query section 7120 and the processing sections 7122 and 7124 together combine the sequence of dictionary reference units up to "p + n", obtained from the matching retrieval section 7112 through the processing in the section 7118, into a single dictionary reference unit. The result is then stored in the table 7126, which is a buffer for storing the dictionary information for which retrieval has been completed. The result of the morphological analysis is passed from the table 7126 to the syntactic analysis section I 7020.

The grouping by means of the morpheme processing information will now be explained with reference to the flowcharts shown in Figs. 55A and 55B. Suppose, for example, that the following character string has been entered into the input processing section 7100 (7300).
Input character string: ". . 26 Jan.,′80 he . ."
The section 7102 divides the input character string into dictionary reference units for retrieval from the dictionary 7018 (7302). In this division, "26" in the input character string is separated as one unit. It is then judged whether the separation of reference units from the input character string has been completed; if it has, the operation ends (7304), and if it has not, the flow advances to the next step 7306.

The dictionary 7018 is searched for "26" in the input character string, and dictionary information indicating that "26" is a "number" is extracted (7306). Morpheme processing information is then attached indicating the morphemic feature "number, number", i.e. that the unit is a sequence of digits and is treated as a grouped cardinal number (7308). It is then judged whether the unit for which dictionary information was obtained has been provided with morpheme processing information in step 7308 (7310). If it has, the flow advances to step 7314, where further processing based on the local analysis rules is applied; a unit that has not been provided with such information is recorded in the table 7126 (7312), and the flow returns to step 7302. Since "26" has been provided with morpheme processing information, the flow advances to step 7314 for this unit. The processing in step 7314 is carried out according to the flowchart shown in Fig. 55B.

First, the initial value "0" is set in the counter n, which counts the number of reference units matched when the reference units are looked up in the matching retrieval section 7112 (7410). With "p" denoting the reference unit for which dictionary retrieval in the section 7106 has been completed, the section 7112 looks up the matching table 7128 for the (p + n)th reference unit (n = 0), i.e. "26" (7412). In step 7308, "26" was provided with morpheme processing information indicating that it is an ordinal number, and the matching table 7128 contains, from the second entry onward, combinations that each begin with an ordinal number (see Fig. 57); the reference unit "26" therefore agrees with the information in the matching table 7128 and is matched. In this case, matching is carried out over the range Ms to Me, where "Ms" denotes the second combination, which was matched, and "Me" denotes the last combination in the matching table 7128 that begins with an "ordinal number".

Based on the result of matching the (p + n)th dictionary reference unit (n = 0) against the table 7128, the matching state is judged (7414). If a match is found, the flow advances to step 7416; if not, the flow advances to step 7424.

If a match is found, the counter n is set to "1" so that the (p + 1)th dictionary reference unit (n = 1) can be separated from the input character string. The separation is performed in the same way as in step 7302. The dictionary 7018 is then searched for "Jan.,", which has been separated as the dictionary reference unit following "26", and morpheme processing information is attached (7420, 7422). This processing is performed in the same way as in steps 7306 and 7308.

By repeating steps 7412 to 7422, the flow is carried through for "26 Jan.,′80 he". However, since "he" does not match the matching table 7128 in step 7412, the flow advances from step 7414 to step 7424. That is, although the data up to "26 Jan.,′80" match "cardinal number, month, year" in the matching table 7128, "26 Jan.,′80 he" does not match.

If the input character string ends with, for example, "26 Jan.,′80", i.e. there is no further dictionary reference unit to separate, the flow advances from step 7418 to step 7424. When no match was found in step 7414, it is judged whether the content of the counter n is greater than 1 (7424); if it is not greater than 1, the unit is recorded in the table 7126 as a single reference unit (7434).

If it is not less than 1, matching is performed with the (p + n)th unit (n = 3), i.e. "he" in "26 Jan.,′80 he", taken as "EOS", which marks the end of the combination (7426, 7428). If there is no match, the flow advances to step 7434. If there is a match, "26 Jan.,′80", i.e. the dictionary reference units p to (p + n - 1), is grouped according to the combination result corresponding to the combination Ms in the matching table 7128, and the result is recorded in the table 7126 (7430). The reference units are then regarded as processed up to the (p + n - 1)th unit, and "p" is reset to (p + n - 1) (7432).
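The flow of Fig. 55B described above can be approximated by the following Python sketch. It is a simplified illustration under stated assumptions, not the patented procedure itself: the contents of `MATCH_TABLE`, the category labels and the function names are hypothetical, the "EOS" check is modelled as a test that the matched categories form a complete table entry, and a group is formed only when more than one unit has been matched.

```python
# Hypothetical matching table (cf. table 7128): each entry is one complete
# combination of unit categories.
MATCH_TABLE = [
    ("NUM", "MONTH", "YEAR"),
    ("NUM", "MONTH"),
    ("MONTH", "NUM"),
]


def is_prefix(categories):
    """True if some table entry starts with the given category sequence."""
    return any(entry[:len(categories)] == categories for entry in MATCH_TABLE)


def is_complete(categories):
    """Stand-in for the "EOS" check: the sequence is a whole table entry."""
    return categories in MATCH_TABLE


def group_units(units):
    """Group (text, category) units into composite units where the table allows.

    p is the index of the current unit and n counts the units matched so far,
    following the counter n of the flowchart.
    """
    result = []
    p = 0
    while p < len(units):
        n = 0
        # Extend the match one unit at a time (steps 7412 to 7422).
        while p + n < len(units) and is_prefix(
            tuple(cat for _, cat in units[p:p + n + 1])
        ):
            n += 1
        matched = tuple(cat for _, cat in units[p:p + n])
        if n > 1 and is_complete(matched):
            # Record units p .. p+n-1 as one grouped unit (step 7430).
            result.append(" ".join(text for text, _ in units[p:p + n]))
            p += n
        else:
            # Record a single reference unit (step 7434).
            result.append(units[p][0])
            p += 1
    return result
```

Applied to the units of the example, `[("26", "NUM"), ("Jan.,", "MONTH"), ("'80", "YEAR"), ("he", "OTHER")]`, the sketch groups the first three units and leaves "he" as a unit of its own.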

The eighth embodiment will now be explained. Fig. 63 shows the overall structure of the eighth embodiment of a speech analyzer having features according to the invention, applied to an automatic English-to-Japanese translation device.

Fig. 63 shows an input section 8001, an English text 8002, a pre-editing section 8003, a morphological analysis section 8004, a syntactic analysis section I 8005, a syntactic analysis section II 8006, an operation display section 8007, a word dictionary 8008, a rule file 8009, a control section 8010, a structure conversion section 8011, a translated sentence generation section 8012, a post-editing section 8013, an output section 8014 and a Japanese sentence 8015. As shown in Fig. 63, this translation device has the input section 8001, through which the English text 8002 to be translated into Japanese is entered. The input section 8001 may comprise, for example, a keyboard with character keys such as alphanumeric and function keys, an optical character reader (OCR) for reading the English text, and/or a file storage device for reading English text recorded on a storage medium such as a magnetic disk.

The English text entered via the section 8001 is read into the pre-editing section 8003, where preprocessing for the translation is carried out. This consists mainly of sentence recognition and the processing of unknown words, and it functions as part of the morphological analysis.

The pre-edited English data are passed, together with the information obtained during pre-editing, to the section 8004. Referring to the word dictionary 8008, the section 8004 segments the data and analyzes the English morphemes; performs various kinds of grouping, such as the processing of unknown words and of expressions for proper names, time and numbers; and performs processing on the sentence as a whole, such as the recognition of tag questions and appositions. The rules for the morphological analysis are contained in the rule file 8009.

After morphological analysis, the English data are passed, together with the dictionary information obtained in the morphological analysis, to the section I 8005. The section I 8005 is a functional section that analyzes the surface structure of the sentences by applying grammatical rules to the English data, and it finds all structural possibilities.

After syntactic analysis in the section I 8005, the English data are passed, together with the associated information, to the section II 8006, where one solution is selected from the surface-structure results of syntactic analysis I by applying a structural description. A plausible parse tree of the English sentence is thereby prepared to represent its structure. These parsing rules are also contained in the rule file 8009.

The English data that have undergone syntactic analysis are passed, as the data of the corresponding parse tree, to a structure conversion section 8011. From this structure tree, which is an intermediate structure of the English sentence, the section 8011 prepares a corresponding Japanese structure tree, converting it into a structure underlying Japanese from which the Japanese sentence can then easily be generated.

The data for the structure tree representing the underlying Japanese structure obtained in this way are passed to the section 8012, in which the translated sentence is produced. This is a functional section that generates a Japanese sentence from the structure tree corresponding to the Japanese sentence.

The Japanese sentence data translated in this way are passed to the post-editing section 8013, which modifies the translated data by consulting the word dictionary 8008 and using the information employed in the translation, in order to produce a more natural Japanese sentence. The data for the Japanese sentence are transferred to the output section 8014 and output from it as the translated Japanese sentence 8015. The output section 8014 comprises a printer, a display and/or a file storage device such as a magnetic disk.

The flow of this series of translation operations is controlled by the control section 8010, which controls the device as a whole. In the illustrated embodiment, the word dictionary 8008 contains dictionary data for English and Japanese words, with various information assigned to the vocabulary, such as connective relations (i.e. co-occurrence relations), meanings, singular and plural forms, parts of speech and so on. The rule file 8009 contains rule data for the morphological analysis and for the parsing of English sentences.
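The order in which the control section 8010 passes data through the sections of Fig. 63 can be summarized in a short Python sketch. This is only a schematic illustration: the stage bodies are placeholders that record their own name, and the function names and the dictionary layout of the passed data are assumptions made here, not details taken from the patent.

```python
from typing import Callable

# Stage names follow the sections of Fig. 63, in processing order.
STAGE_NAMES = [
    "pre-editing (8003)",
    "morphological analysis (8004)",
    "syntactic analysis I (8005)",
    "syntactic analysis II (8006)",
    "structure conversion (8011)",
    "sentence generation (8012)",
    "post-editing (8013)",
]


def make_placeholder_stage(name: str) -> Callable[[dict], dict]:
    """Build a stand-in stage that only logs that it was visited."""
    def stage(data: dict) -> dict:
        # A real stage would transform data["payload"]; this one leaves it as is.
        return {"payload": data["payload"], "log": data["log"] + [name]}
    return stage


def control_section(english_text: str) -> dict:
    """Run all stages in sequence, as the control section 8010 is said to do."""
    data = {"payload": english_text, "log": []}
    for name in STAGE_NAMES:
        data = make_placeholder_stage(name)(data)
    return data
```

Each stage receives the output of the preceding one together with the accumulated information, which mirrors the statement that the data are handed on "together with" the information obtained in the previous section.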

The control section 8010 is connected to the operation display section 8007, which has operation keys by which an operator issues various commands to the device, such as a translation command key and cursor keys, and a display that visually presents the entered English text, the Japanese sentence resulting from the translation, intermediate data such as dictionary information, and various instructions to the operator. The device may be designed so that most of these operation and display functions are provided by the keyboard, if one is provided in the input section 8001, or by the display, if one is provided in the output section 8014.

The embodiment described relates to the automatic translation device. Its purpose is that, when a derived word is contained in the English text 8002, a grammatical feature, a semantic feature, the (Japanese) equivalent, etc. are evaluated according to the conditions for the derived word, which has been recognized, for example, from an affix, thereby increasing the reliability of the resulting parse or translation. An affix dictionary is used to evaluate dictionary information for unknown words in the morphological analysis. Three kinds of processing are required as processing modes: processing for the prefix, processing for the suffix, and evaluation processing by the suffix. The data, however, is essentially of two types: prefix evaluation data and suffix evaluation data.

First, the prefix and suffix evaluation data are explained in detail.

(1) Prefix evaluation data

If the prefix part of a word not registered in the dictionary matches one of the prefixes described below and the remaining part is present in the dictionary, the word is treated according to the dictionary data for its root. In these dictionary data, an internal feature can be added to the original list of internal features, and a Japanese suffix can be added to the original translation word. For the input word "electrochemical", for example, the entry "electrochemical" in the dictionary information preservation table can be formed, in accordance with the dictionary data, from the prefix entry "electro" and the dictionary entry "chemical", as shown below:

Prefix entry        Dictionary entry
electro             chemical

Input word          Dictionary information table entry
electrochemical     electrochemical

The entry in the table then inherits all the dictionary information that belongs to the word root.
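The prefix evaluation just described can be sketched as follows. This is only a minimal illustration under stated assumptions: the `Entry` fields, the sample dictionaries, the feature labels, and the `JP(...)` placeholder strings are hypothetical and do not come from the patent, which does not disclose its data layout.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Entry:
    pos: str                 # part of speech
    features: tuple = ()     # internal features
    equivalent: str = ""     # translation equivalent (placeholder text here)

# Hypothetical sample data; the real dictionary contents are not given in the text.
DICTIONARY = {
    "chemical": Entry(pos="adjective", features=("root-feature",),
                      equivalent="JP(chemical)"),
}
PREFIXES = {
    "electro": {"add_feature": "prefix-feature", "add_suffix": "+JP(suffix)"},
}

def evaluate_prefix(word, dictionary=DICTIONARY, prefixes=PREFIXES):
    """Build an entry for an unregistered word from a prefix plus a registered root.

    Returns None when no prefix matches or the remainder is not in the dictionary.
    """
    for prefix, info in prefixes.items():
        root = word[len(prefix):]
        if word.startswith(prefix) and root in dictionary:
            base = dictionary[root]
            # Inherit everything from the root, then extend the feature list
            # and append the suffix to the translation equivalent.
            return replace(
                base,
                features=base.features + (info["add_feature"],),
                equivalent=base.equivalent + info["add_suffix"],
            )
    return None
```

With the sample data, `evaluate_prefix("electrochemical")` yields an adjective entry that carries the root's feature plus the one contributed by the prefix.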

(2) Suffix evaluation data

If the suffix part of a word not registered in the dictionary matches one of the suffixes described below and the remaining part is present in the dictionary, the word is registered by creating new dictionary data according to the information described in the suffix dictionary. In this case, the first (Japanese) equivalent in the dictionary data corresponding to the root of the word is extracted and used as the (Japanese) equivalent in the new dictionary data.

For the input word "controler", for example, the table entry "controler" is registered on the basis of the suffix entry "-(e)r" and the dictionary entry "control".

Suffix entry        Dictionary entry
(verb)-(e)r         control
noun                verb

Input word          Dictionary information table entry
controler           controler
                    noun

Next, an outline of derived-word processing and unknown-word processing is given.

(1) When a word not registered in the dictionary contains a prefix or suffix at its beginning or end and the remaining part of the word is registered in the dictionary, the English part of speech, the internal features, and the Japanese equivalent are composed on the basis of the dictionary and the affix information.
(2) Prefixes and suffixes are listed in a file and can be edited independently of the program.
(3) First the prefix possibility is tried; if that fails, the suffix possibility is tried. If the word contains both, no attempt is made.
(4) For a word for which the attempted evaluation has failed, processing based on the word ending is performed, treating it as an unknown word.
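The order of attempts in points (1) to (4) might look like the sketch below. It rests on assumptions: the tiny word lists are hypothetical, and point (3) is read here as "a prefix and a suffix are never stripped at the same time", which is one plausible interpretation of the text rather than a confirmed detail.

```python
# Hypothetical sample data, for illustration only.
DICTIONARY = {"chemical", "control"}
PREFIXES = ("electro",)
SUFFIXES = ("er", "r")

def derive(word):
    """Classify an unregistered word as prefix-derived, suffix-derived, or unknown."""
    # Step 1: try the prefix possibility first.
    for p in PREFIXES:
        if word.startswith(p) and word[len(p):] in DICTIONARY:
            return ("prefix", p, word[len(p):])
    # Step 2: only if no prefix analysis succeeded, try the suffix possibility.
    for s in SUFFIXES:
        if word.endswith(s) and word[:-len(s)] in DICTIONARY:
            return ("suffix", word[:-len(s)], s)
    # Step 3: no combined prefix-plus-suffix stripping is attempted;
    # the word is handed on to unknown-word processing.
    return ("unknown", word, None)
```

For example, `derive("controler")` reports a suffix derivation from "control", while a word such as "electrocontroler", which would need both affixes removed, falls through to the unknown-word case.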

An eighth embodiment will now be explained with reference to the block diagram in Fig. 58. Fig. 58 shows an input processing section 8020, an input dividing section 8021, a delimiter table 8022, a dictionary lookup section 8026, a derivation processing section 8023, a reference dictionary 8024, and a dictionary information preservation table 8025. First, an English sentence is written into the input processing section from an input device comprising a document input file, a keyboard, an OCR, etc. The input is then divided into dictionary lookup units in the input dividing section with reference to the delimiter table and, unless the input has ended, a dictionary lookup is performed using the reference dictionary. If the lookup yields an entry, the result is recorded in the dictionary information preservation table; if there is no entry, derived-word processing is performed.
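The flow of Fig. 58 can be sketched roughly as follows. The delimiter set, the dictionary contents, and the case-insensitive lookup are assumptions made for illustration; the patent does not specify them.

```python
# Hypothetical delimiter table and reference dictionary.
DELIMITERS = {" ", ",", ".", "?", "!"}
REFERENCE_DICTIONARY = {"the": "article", "device": "noun", "works": "verb"}

def split_units(text, delimiters=DELIMITERS):
    """Divide the input into dictionary lookup units at delimiter characters."""
    units, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                units.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        units.append("".join(current))
    return units

def look_up_sentence(text):
    """Return (preservation_table, units_needing_derived_word_processing)."""
    table, pending = {}, []
    for unit in split_units(text):
        key = unit.lower()  # assumption: lookup ignores case
        if key in REFERENCE_DICTIONARY:
            table[unit] = REFERENCE_DICTIONARY[key]   # entry found: record it
        else:
            pending.append(unit)                      # no entry: derive later
    return table, pending
```

Units that end up in the second list are the ones that would be passed on to the derivation processing section.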

Fig. 59 is a block diagram illustrating an embodiment of derivation processing by means of a prefix. Fig. 59 shows a matching section 8030 between the beginning of the word and the prefix dictionary, a prefix dictionary 8031, a section 8032 for dictionary lookup of the part other than the prefix, a matching section 8033 between the part of speech in the prefix dictionary and the part of speech of the entry, a section 8034 for preparing dictionary information by prefix evaluation, and a section 8035 for derivation processing by the prefix.

Fig. 60 is a block diagram illustrating an embodiment of derivation processing by means of a suffix. Fig. 60 shows a matching section 8040 between the word ending and the suffix dictionary, a suffix dictionary 8041, a processing section 8042 for a completely unregistered (entirely unknown) word, a dictionary lookup section 8043 for looking up the part other than the suffix, a matching section 8044 between the part of speech of the root and the part of speech of the entry in the suffix dictionary, a processing section 8045 for preparing dictionary information by suffix evaluation, a dictionary lookup section 8046 for performing a lookup after applying the root change given in the suffix dictionary to the part other than the suffix, and a section 8047 for processing an unregistered word by suffix evaluation.

First, the word ending is matched against the suffix dictionary. If there is no match, the word is processed as a completely unregistered word; if there is a match, the dictionary is looked up for the part other than the suffix. If this lookup yields no entry, a further lookup is performed after applying the root change given in the suffix dictionary to the part other than the suffix. If this again yields no entry, unregistered-word processing by suffix evaluation is performed. If, on the other hand, an entry is found, the part of speech of the entry is matched against the part of speech of the root given in the suffix dictionary, in the same way as when the first lookup of the part other than the suffix yields an entry. If they match, dictionary information is prepared by suffix evaluation; if they do not match, unregistered-word processing by suffix evaluation is performed.
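The decision sequence of Fig. 60 can be expressed as a short function. Everything concrete here is assumed for illustration: the `SuffixRule` fields, the sample rule with a root change of "e", the dictionary contents, and the `JP(...)` placeholders. The text does not say what unregistered-word processing by suffix evaluation produces, so the sketch only returns a label for that branch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SuffixRule:
    suffix: str
    root_pos: str          # part of speech the root must have
    new_pos: str           # part of speech of the derived word
    root_change: str = ""  # string appended to the stem for the second lookup

# Hypothetical sample data: word -> (part of speech, list of equivalents).
DICTIONARY = {
    "control": ("verb", ["JP(control)"]),
    "make": ("verb", ["JP(make)"]),
    "table": ("noun", ["JP(table)"]),
}
SUFFIX_RULES = [SuffixRule("er", "verb", "noun", root_change="e")]

def process_suffix(word):
    """Follow the suffix branch: match, look up, retry with root change, check POS."""
    rule = next((r for r in SUFFIX_RULES
                 if word.endswith(r.suffix) and len(word) > len(r.suffix)), None)
    if rule is None:
        return ("completely_unregistered", None)
    stem = word[:-len(rule.suffix)]
    entry = DICTIONARY.get(stem)
    if entry is None and rule.root_change:
        # Second lookup with the root change applied to the stem.
        entry = DICTIONARY.get(stem + rule.root_change)
    if entry is None:
        return ("unregistered_by_suffix", None)
    pos, equivalents = entry
    if pos != rule.root_pos:
        return ("unregistered_by_suffix", None)
    # New dictionary data: part of speech from the suffix rule,
    # equivalent taken from the first equivalent of the root.
    return ("derived", (rule.new_pos, equivalents[0]))
```

With the sample data, "controler" is resolved on the first lookup, "maker" only after the root change restores "make", and "tabler" is rejected because its root is not a verb.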

Fig. 61 is a block diagram showing details of the overall structure obtained by combining the parts of Figs. 58 to 60; Fig. 62 shows the details of the processing section 8042 for a completely unregistered word shown in Fig. 61. This processing section 8042 comprises a processing section 8050 that prepares evaluated information for a noun and a corresponding section 8051 for a verb. Since each of the parts in Figs. 61 and 62 has already been described in detail, they need not be explained again.

A ninth embodiment with features according to the invention will now be described. Fig. 64 shows the overall structure of the ninth embodiment, which is applied to an automatic English-to-Japanese translation device. This embodiment has an input section 9010 through which an English text 9012 to be translated into Japanese is input. The input section 9010 may comprise a keyboard with character keys such as alphanumeric or function keys, an optical character reader (OCR) for reading English text recorded on paper, and/or a file storage device for reading English text recorded on a storage medium such as a magnetic disk.

The English text input from the section 9010 is read into a pre-editing section 9014, in which preprocessing for translation is performed. This consists mainly of sentence recognition and unknown-word processing, and thus functions as part of the morphological analysis.

The pre-edited English data are transferred, together with the information obtained during pre-editing, to a section 9016 for morphological analysis. The section 9016 divides the sentence in order to consult a word dictionary 9018, analyzes the English morphemes, performs various kinds of processing such as unknown-word processing, proper-noun processing, and handling of expressions of time, number, etc., and also performs processing on the sentence as a whole, such as tag-question and apposition recognition. The rules for the morphological analysis are contained in a rule file 9036.

The English data that have undergone morphological analysis are transferred, together with the dictionary information obtained from that analysis, to a section I 9020 for syntactic analysis. In this embodiment, the section I 9020 is a functional section that parses the surface structure of the sentence bottom-up and right-to-left by applying context-free grammar (CFG) rules to the English data, and finds all structural possibilities.

After syntactic analysis in section I 9020, the English data are passed, together with the analysis information, to a section II 9022 for syntactic analysis. This section selects one solution from the surface-structure analysis results of section I by applying a structural description. In this way, a plausible parse tree representing the structure of the English sentence is prepared. These parsing rules are likewise stored in the rule file 9036. The English data that have undergone syntactic analysis are transferred as parse-tree data to the structure conversion section 9024.

In the section 9024, the parse tree, which is an intermediate structure of the English sentence, is converted into a structure tree for the corresponding Japanese sentence, i.e. into an underlying Japanese structure from which a Japanese sentence can readily be generated. The data for the structure tree representing this underlying Japanese structure are passed to a translation section 9026, in which the translated sentence is generated. This function produces a Japanese sentence from the Japanese structure tree. First, a sentence structure is produced by reordering the structure tree so that it conforms to Japanese word order; then morpheme generation is performed, traversing the sentence structure tree top-down and left-to-right, to produce the translated sentence.
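The two generation steps, reordering the structure tree and then reading it off top-down and left-to-right, can be illustrated with a toy tree. The `Node` type, the category labels, and the single reordering rule (move the verb to the end of its verb phrase, reflecting verb-final Japanese word order) are assumptions chosen for the sketch; the leaves keep their English words as placeholders where a real system would emit Japanese morphemes.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    word: str = ""   # only set on leaves

def reorder(node):
    """Return a copy of the tree with verbs moved to the end of each VP."""
    kids = [reorder(k) for k in node.children]
    if node.label == "VP":
        verbs = [k for k in kids if k.label == "V"]
        others = [k for k in kids if k.label != "V"]
        kids = others + verbs
    return Node(node.label, kids, node.word)

def generate(node):
    """Traverse top-down, left-to-right and collect the leaf words."""
    if not node.children:
        return [node.word]
    out = []
    for child in node.children:
        out.extend(generate(child))
    return out

# Hypothetical parse tree for "I read books".
tree = Node("S", [
    Node("NP", word="I"),
    Node("VP", [Node("V", word="read"), Node("NP", word="books")]),
])
```

Calling `generate(reorder(tree))` gives the leaves in subject, object, verb order, while `generate(tree)` keeps the original English order.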

The data for the Japanese structure tree produced in this way are translation data, which are passed to a post-editing section 9030. There the translation data are modified by consulting the dictionary 9018, using the information employed during translation processing, so as to complete a more natural Japanese sentence. The data for the Japanese sentence are transferred to an output section 9032 and output from it as the translated Japanese sentence 9034. The output section 9032 comprises, for example, a printer, a display, and/or a file storage device such as a magnetic disk. The flow of the series of translation operations is controlled by a control section 9038, which governs the entire device. The dictionary file 9018 stores dictionary data for English and Japanese words, and in this embodiment the rule file 9036 stores data for the morphological and syntactic analysis.

The control section 9038 is connected to an operation display section 9040, which has operation keys for issuing various commands from an operator to the device, for example a translation command key or a cursor key, and a display that visually presents the input English text, the Japanese sentence resulting from translation, intermediate data such as dictionary information, and various instructions to the operator. The device may be arranged so that most of the operation and display functions are provided by a keyboard, if one is arranged at the input section 9010, or by a display, if one is arranged at the output section 9032.

In the section I for syntactic analysis, the CFG rules are applied to the English sentence top-down and right-to-left, on the English data after morphological analysis, in order to derive all possible solutions for the sentence structure. The solutions are generally understood in the form of a structure tree. This tree shows how the words or groups contained in each sentence relate to one another, whether in a dependency or co-occurrence relation such as a modification relation or a case relation, or in a hierarchical relation such as that between parent, child, and grandchild. Each word or group is located at a node of the structure tree.
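To make "all possible solutions for the sentence structure" concrete, the sketch below enumerates every parse of a short tag sequence under a toy binary CFG. Note the assumptions: the grammar and tag names are invented for the example, and the enumeration uses a memoized CKY-style span search, a standard technique that is not the traversal order described in the text.

```python
from functools import lru_cache

# Hypothetical toy grammar in binary form: (left, right) -> possible parents.
RULES = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],
    ("NP", "PP"): ["NP"],
    ("P", "NP"): ["PP"],
}

def all_parses(tags):
    """Return every structure tree that derives the tag sequence from S."""
    tags = tuple(tags)

    @lru_cache(maxsize=None)
    def span(i, j):
        # Leaf: (symbol, position). Inner node: (symbol, left_tree, right_tree).
        if j - i == 1:
            return ((tags[i], i),)
        trees = []
        for k in range(i + 1, j):
            for left in span(i, k):
                for right in span(k, j):
                    for parent in RULES.get((left[0], right[0]), []):
                        trees.append((parent, left, right))
        return tuple(trees)

    return [t for t in span(0, len(tags)) if t[0] == "S"]
```

For the sequence NP V NP P NP (as in "I saw the man with the telescope") the function returns two trees, one attaching the prepositional phrase to the verb phrase and one attaching it to the object noun phrase. Choosing between such alternatives is the job the text assigns to section II.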

In this embodiment, the syntactic analysis discriminates features relating to the form and vocabulary of a sentence in order to identify groupings in terms of sentence structure. Such a grouping is referred to here as a "unit" or a "block".

A "unit" is a word or group of words that forms the smallest unit of the translation process. It is treated identically to a single word in the syntactic analysis, and the dictionary information for each of its constituent elements is not used.

A "block" is a structural grouping whose interior is syntactically analyzed in preference to its exterior and which, with respect to its exterior, is treated equivalently to a unit. It may be, for example, a clause, a phrase, etc., or whatever corresponds to the intermediate (nonterminal) symbols used in the CFG grammar. It may also have a nested structure, i.e. a block may contain another block. The notion of a block may further include a sentence, a paragraph, and the whole text, each of which can be regarded as a block. Processing that gives priority to this partial syntactic analysis is referred to here as "partial parsing". It reduces the wasteful structural solutions described above, improving the efficiency of the analysis and yielding a more plausible analysis result.

Two attributes are defined for a block in this embodiment. One is a symbol in the CFG rules, referred to in this description as the "goal", into which the constituent elements inside the block are to be assembled as the result of analyzing them, i.e. a symbol describing the structure or property of the block. The other is a symbol in the CFG rules, referred to as the "role", which the block bears when syntactic analysis is performed on the exterior of the block, for the sentence, phrase, or clause in which the block is contained, i.e. a symbol describing the relation or role of the block with respect to the other elements.

For example, in the case of an English sentence I "White House isn't white", the goal is a sentence and the role is a noun (a phrase). Although the goal and the role are identical in most cases, they sometimes differ from each other, as in this case.

When, for the embodiment shown in Fig. 62, the structural groupings of the input English sentence are recognized as blocks, the functional sections for evaluating their goal and role are arranged as shown in Fig. 65. As can be seen from Fig. 65, the structural groupings of the English sentences pre-edited in section 9014 are identified in the morphological analysis section 9016 with the aid of the word dictionary 9018 and the rule file 9036.

The word dictionary 9018 stores dictionary information for English words and phrases. For example, as shown in Fig. 68, in this embodiment a separate entry is created for each variant of a word, and all information is fully expanded. Several parts of speech may be provided as part-of-speech information, as shown in Fig. 68. The way of constructing the dictionary 9018 is not limited to this example.

The rule file 9036 stores, in the form of a table, the start condition indicating the beginning of a block, the end condition indicating its end, and block-creation information for creating the block with its goal and role. An example is shown in Fig. 69. For example, a block is started by "conjunction" and ended by the end of the sentence. Accordingly, a block is formed that begins with the "," immediately preceding the conjunction; its goal is a clause, while its role is a sentence. A further block is formed from the conjunction to the end of the sentence, in which both the goal and the role are a clause.

Further, a block is started by ", relative pronoun" and ended by "," or by the end of the sentence. As in this case, several end conditions are permitted for one start condition. If the block is ended by ",", the cluster from the "," preceding the relative pronoun up to the next occurrence of "," forms a block in which the goal is a clause and the role is an adverb or an adjective. This means that the cluster functions as an adverbial or adjectival clause. If the block ends at the end of the sentence, the cluster from the "," preceding the relative pronoun to the end of the sentence forms a block in which the goal is a clause and the role is an adverb or an adjective. These correspond to the conditions for forming a phrase, a clause, or a sentence that appear in ordinary modern English sentences. In the figure, the symbol "" represents a space.

In the morphological analysis section 9016, the English text input from the pre-editing section 9014 is first divided into sentences as the translation units. At this stage, misspellings and unregistered words are detected. The dictionary 9018 is consulted for each sentence unit, and the dictionary information for each constituent is retrieved. Various kinds of grouping are carried out according to this dictionary information.
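The table of start conditions, end conditions, goal, and role described for the rule file 9036 could be held as plain data. The sketch below is an illustration under stated assumptions, not the layout of Fig. 69: the type `BlockRule`, the category names `REL_PRONOUN` and `EOS` (end of sentence), and the helper `end_conditions` are all hypothetical. Only the relative-pronoun rule from the text is encoded.

```python
from typing import List, NamedTuple, Tuple


class BlockRule(NamedTuple):
    """One hypothetical row of the block rule table."""
    start: Tuple[str, ...]   # token categories that open the block
    end: str                 # category that closes it ("EOS" = end of sentence)
    goal: str                # cfg symbol for the interior of the block
    role: Tuple[str, ...]    # cfg symbols the block may present to its exterior


# ", relative pronoun" may be closed by "," or by the end of the sentence,
# so the same start condition appears with two end conditions.
RULES: List[BlockRule] = [
    BlockRule((",", "REL_PRONOUN"), ",", "clause", ("adverb", "adjective")),
    BlockRule((",", "REL_PRONOUN"), "EOS", "clause", ("adverb", "adjective")),
]


def end_conditions(start: Tuple[str, ...]) -> List[str]:
    """Return every end condition registered for the given start condition."""
    return [rule.end for rule in RULES if rule.start == start]


print(end_conditions((",", "REL_PRONOUN")))
```

Storing one row per (start, end) pair is just one possible design; it keeps each row's goal and role independent even when the start condition is shared.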

Fig. 66 shows a flowchart of the block grouping performed in the morphological analysis section 9016. First, a position pointer indicating the read position in the English sentence is set to the beginning (9100). The beginning here does not mean the first word, but an imaginary position immediately in front of the sentence. Word extraction 9101 is performed at this position. As shown in Fig. 67, in process 9101 a word is extracted by advancing the position by one (9111), provided the pointer is not at the end of the sentence (9110); the dictionary 9018 is consulted for the word (9112), and the word information is written out (9113).

Once the word information has been extracted in process 9101, the table 9036 of block start and end conditions is consulted to judge whether anything matches a start condition (9102). Steps 9101 and 9102 are repeated until a word matching a start condition is found.

When the start condition is met, the next word and as many following words as required are extracted, and they are checked against the start condition of the block (9104). The dictionary is consulted for each of these words if necessary. The position pointer is not advanced.

If the start condition of the block is matched at step 9104, a word matching the block end condition associated with that start condition is searched for (9105). Steps 9104 to 9106 are repeated until a word matching the end condition is found. When a word matches the end condition (9106), the cluster including that word is recognized as a block, and the block is written (9107). Specifically, a block is created on the judgment that the block-creation condition is satisfied at the point where the end condition is first met. With reference to the table 9036 of block-creation information, the position of the word indicated by the pointer at the point where its advance was stopped in process 9103 is set as the start position of the block, and the position of the first subsequent word that satisfies the end condition is set as the end position of the block. At the same time, the goal and the role of the block are written.
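The scan of Fig. 66 can be sketched roughly as follows. This is a simplified illustration, not the patent's procedure: the function name `find_blocks`, the token and rule layouts, and the category names are assumptions, the sentence end is modeled as a final token of category `EOS`, and the step numbers in the comments are only an approximate mapping.

```python
def find_blocks(tokens, rules):
    """tokens: list of (word, category); rules: list of
    (start_categories, end_categories, goal, role).
    Returns a list of (start_pos, end_pos, goal, role)."""
    blocks = []
    for i in range(len(tokens)):                          # 9101: take out the next word
        for start, ends, goal, role in rules:
            # 9104: look ahead over as many words as the start condition needs,
            # without advancing the position pointer i.
            lookahead = tuple(cat for _, cat in tokens[i:i + len(start)])
            if lookahead != start:                        # 9102: start condition met?
                continue
            for j in range(i + len(start), len(tokens)):  # 9105: search for an end
                if tokens[j][1] in ends:                  # 9106: first end condition met
                    blocks.append((i, j, goal, role))     # 9107: write the block
                    break
    return blocks


# Hypothetical rule and sentence: "Tom, who lives here, is tall."
RULES = [((",", "REL_PRONOUN"), {",", "EOS"}, "clause", ("adverb", "adjective"))]
TOKENS = [("Tom", "PROPER_NOUN"), (",", ","), ("who", "REL_PRONOUN"),
          ("lives", "VERB"), ("here", "ADVERB"), (",", ","),
          ("is", "VERB"), ("tall", "ADJECTIVE"), (".", "EOS")]
print(find_blocks(TOKENS, RULES))
```

Because the outer loop keeps scanning after a block has been written, a start condition occurring inside an already opened block is still found, which is what permits nested blocks.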

If, as a result of such block recognition, the English sentence contains ". . ., conjunction . . ." as shown for example in Fig. 70, the cluster from the beginning of the sentence up to the part preceding "," is recognized as one block, while the cluster from ", conjunction" to the end of the sentence is recognized as another block. In the figure, the inside of [ ] denotes a block. In this block, both the goal and the role are sentences. Furthermore, the cluster from the word after the conjunction to the end of the sentence forms a further block, in which both the goal and the role are sentences. Alternatively, the cluster from the conjunction to the end of the sentence may be defined as a block. In that case the goal is a clause and the role is an adverb.

The block may also be defined so that it starts from a position that does not include the ",". Furthermore, punctuation and the like may be excluded from the grammatical analysis and kept merely as information held in the block. In the same way, if ". . . relative pronoun . . ." is present, "relative pronoun . . ." can be recognized as a block. In this block the goal is a clause or a sentence, and the role is an adverb or an adjective.

The block can of course also be nested. For example, if the English sentence has a structure such as "(beginning of sentence) . . ., conjunction . . ., relative pronoun . . ., (end of sentence)", as shown for example in Fig. 71, then the cluster from ", conjunction" to the end of the sentence forms a block BL1-BL1, which contains ", relative pronoun . . .," as another block BL2-BL2.

In this way, the morphological analysis section 9016 examines the features of the sentence in terms of form and vocabulary in order to identify structural groupings as blocks. In addition to such block recognition, section 9016 performs various other processing to prepare the morphological analysis data: handling of proper-noun expressions, derived words, unknown words, abbreviations, numerals, time expressions, hyphenated words, and apostrophes (′), as well as apposition evaluation and tag-question processing.

The English sentence that has been morphologically analyzed in this way is passed, together with the analysis information, to the syntactic analysis section I 9020. Fig. 72 shows an example of the output data, namely the result obtained when an English sentence I "White House isn't white" is input from the input section 9010 and morphologically analyzed in section 9016. Block 1 starts at word position #4 and ends at position #10; in this case both its goal and its role are left open. Likewise, block 2 starts at position #5 and ends at position #6; its goal is a noun phrase, while its role is a proper noun. That is, the block "White House isn't white." contains another block, "White House", nested inside it. In the smaller block "White House", the constituents inside function together as a proper noun, while the block occupies the position of a single phrase relative to its exterior, i.e. "isn't white.". "White House" can therefore be treated as one unit.

Together with this block information, the word information retrieved from the word dictionary 9018 is added and passed from the morphological analysis section 9016 to the syntactic analysis section I 9020. Section I 9020 analyzes the surface structure of the English sentence by applying the context-free grammar rules stored in the rule file 9036, in order to find all possible structure trees. If a block is present, the partial parsing described above is performed, giving priority to the local analysis. This improves both the efficiency and the accuracy of the analysis.

Specifically, the containment relations among the blocks are first derived from the block position information. The innermost block is then parsed. A block whose parsing has been completed is regarded as a unit, and no further processing is performed on its interior. In this way the scope of parsing is gradually extended to the outer blocks, until finally the whole sentence has been analyzed. Each parse is performed on the basis of the cfg rules, top-down and from right to left in the English sentence. Parsing is carried out in such a way that all possibilities permitted by the grammar rules are retained.
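Deriving the containment relation from the block positions, and from it an innermost-first processing order, can be sketched as below. The function name `innermost_first` and the representation of a block as a `(start, end)` pair are assumptions made for illustration; the sketch also assumes that blocks nest cleanly and never cross.

```python
def innermost_first(blocks):
    """blocks: list of distinct (start, end) spans that nest but never cross.
    Returns the spans ordered so that every block comes before any block
    that contains it."""
    def depth(block):
        # Nesting depth = number of other blocks whose span encloses this one.
        return sum(1 for other in blocks
                   if other != block
                   and other[0] <= block[0] and block[1] <= other[1])
    # Deepest blocks first; sorted() is stable, so ties keep their input order.
    return sorted(blocks, key=depth, reverse=True)


# Spans taken from the Fig. 72 example: block 1 = #4..#10, block 2 = #5..#6.
print(innermost_first([(4, 10), (5, 6)]))
```

After a block has been parsed, it would be handed to the next outer parse as a single unit carrying the block's role, so its interior is not visited again.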

Fig. 73 shows an example of such an analysis flow. First, based on the English data supplied to section I 9020, all structural groupings of a sentence are recognized as blocks, and their goals and roles are evaluated (9120). The manner of grouping is as shown in Fig. 70. If no block is present (9121), the sentence is parsed (9125), only a solution in which the sentence is assembled into a single symbol is selected, and the analysis of the sentence ends (9126). Processing steps 9125 and 9126 are subsumed in processing steps 9121 to 9124.

If a block is present, the innermost block is parsed first (9122). In the example shown in Fig. 71, the interior of block BL2-BL2 is parsed. Although parsing generally yields several solutions, only a solution in which the block is assembled into a single cfg symbol that matches the goal of the block is selected from among them (9123). For a block whose goal is left open, every solution that is assembled into a single symbol is selected. Everything selected in this way is then treated as a single unit having the role of the block (9124). For a block whose role is optional, the symbol selected in process 9123 is taken as the role. Processes 9121 to 9124 are then repeated in turn.

In this way, in the example of Fig. 71, the interior of block BL2-BL2 is parsed first, and then the interior of block BL1-BL1. In the latter step, block BL2-BL2 is treated like a single word, and the constituents it contains are not parsed again.

Once the data defining the structural groupings and their subordinate relations have been obtained in this way, they are passed to the syntactic analysis section II 9022. The data can easily be viewed in the form of a structure tree, as described above. The data are then converted into the structure of the Japanese sentence in the structure conversion section 9024 and in the translation section 9026, and a translated sentence is generated for each of the nodes contained in it. Node processing in the structure tree proceeds top-down and from left to right.

The translated sentence generated in this way is then post-processed in the post-editing section 9030, displayed on the operation display section 9040, and printed out, for example, as a Japanese sentence 9034 in the output section 9032.

Thus, according to this embodiment, the features of the English sentence are examined in terms of form and vocabulary in order to identify structural groupings as blocks. For each block, the goal, which the analysis result may take, and the structural role that the block plays toward its exterior are evaluated. The surface structure of the English sentence is then analyzed by applying context-free grammar rules in order to find all possible structure trees. This reduces the number of uneconomical solutions, improves the efficiency of the grammatical analysis, and provides a more reliable analysis result.

There are, however, various patterns of appositional expression, and they are difficult to recognize in grammatical analysis, in particular in context-free analysis. Since, with the foregoing embodiment, it is generally difficult to perform apposition recognition after the analysis, an ambiguous translation is unavoidable. Even if a rule were prepared for recognizing them, there would be a risk that an expression that is not appositional is recognized as one, or that the number of possible combinations grows. That is, a costly local analysis would be carried out between the parts contained in the appositional expression and the other parts.

In view of this, in the present embodiment the processing load in the analysis step can be reduced by recognizing the appositional expression from the features of the sentence in terms of form, or from the semantic features of words. Apposition is evaluated by recognizing the following patterns as a block.

For the English sentence structure "∼, relative pronoun, ∼.∼", the relative pronoun is recognized by giving the part-of-speech code of the word a special code, for example "R". In this case the part enclosed by the two "," is regarded as a block, provided that it does not overlap a block or unit designated in pre-editing and that the part after the second "," does not contain "and" or "or". For the English sentence structure "∼, relative pronoun ∼.", the part enclosed by "," and "." is regarded as a block. The period may be any other symbol used to mark the end of a sentence.
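The two patterns can be checked with a short scan over tagged words. The sketch below is an illustration under assumptions: the function name `relative_clause_block` and all part-of-speech codes other than "R" are hypothetical, the last token is assumed to be the sentence-final symbol, and the condition about not overlapping a block or unit marked in pre-editing is left out for brevity.

```python
def relative_clause_block(tokens):
    """tokens: list of (word, pos_code), where "R" marks a relative pronoun.
    Returns the (start, end) positions of the block, or None."""
    for i in range(len(tokens) - 1):
        if tokens[i][0] == "," and tokens[i + 1][1] == "R":
            # Pattern "~, relative pronoun ~, ~.": block closed by a second ",".
            for j in range(i + 2, len(tokens)):
                if tokens[j][0] == ",":
                    rest = [word.lower() for word, _ in tokens[j + 1:]]
                    if "and" in rest or "or" in rest:
                        return None        # "and"/"or" after the second ","
                    return (i, j)
            # Pattern "~, relative pronoun ~.": block runs to the sentence end.
            return (i, len(tokens) - 1)
    return None


# Hypothetical tagged sentence: "Tom, who lives here, is tall."
SENTENCE = [("Tom", "N"), (",", ","), ("who", "R"), ("lives", "V"),
            ("here", "ADV"), (",", ","), ("is", "V"), ("tall", "ADJ"), (".", ".")]
print(relative_clause_block(SENTENCE))
```

The check uses only surface form and dictionary codes, which matches the stated aim of recognizing the block before any grammatical analysis is run.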

To perform such apposition evaluation, the dictionary 9018 is designed to store semantic information for words. The semantic information distinguishes between thing, place, person, and so on, as shown in Fig. 74. The table 9036 of block-creation conditions is likewise designed as shown in Fig. 75, so that the beginning of a block is recognized by "proper noun (person), noun (person)" as the start condition, and also by "proper noun (person), article noun (person)". It is thus possible to evaluate the appositional expression from its morphological and semantic features without performing a grammatical analysis, and to carry out the grammatical analysis in accordance with the apposition evaluation, with the other processing operations as in the example shown in Fig. 64.

Incidentally, English sentences contain phrases that carry very specific information and are used only in a limited way. If they are analyzed in the same way as ordinary phrases, they are analyzed as an entirely different type of sentence, and it is difficult to recover the original nature of the sentence from the analysis. This also causes a considerable loss of efficiency.

For example, "let's" or "let us" immediately after punctuation or the like is analyzed as an imperative sentence with the causative verb "let", although it should be analyzed as a phrase with the inviting sense of "let". "Let" can also be used in various ways as a transitive verb, and even as a noun meaning "rental"; it is not restricted to friendly invitations or requests, nor to use as an auxiliary verb. The analysis therefore has to be carried out for each of these possibilities, which reduces efficiency. Moreover, it is difficult to derive the inviting use from the result of the analysis, because the causative and inviting uses do not differ as far as sentence structure is concerned, and it is difficult to distinguish them on the basis of sentence structure alone.

The loss of efficiency during analysis can be reduced by excluding "let's" or "let us" immediately after punctuation from the object of the analysis. By separating it from the basic use of the word, i.e. the causative use, a semantic analysis can be carried out easily.

When "please", "let's" or "let us" appears at the head of a block, a flag is set in the block information, and in each of these cases the word is not entered as information for the unit. For example, the English sentence "let's go to school." is processed as "go to school" with "let's" attached to it.

To carry out such "let" processing, a let-information processing section 9200 is arranged between the morphological analysis section 9016 and the syntactic analysis section I 9020 in a modification of this embodiment. Fig. 77 shows the relevant sections. In this figure, elements identical to those shown in Fig. 64 are denoted by the same reference numerals.

Furthermore, the dictionary 9018 is designed to store the let information for each word. As shown in Fig. 78, the let information is "0" for ordinary words, "1" for "let's" and "let us", and "2" for "please".

The let-information processing section 9200 receives the result of the morphological analysis together with the input English sentence from section 9016 and adds the let information as additional information to the information of a word during the analysis, as shown in Fig. 79. In this case, a block is set up for the sentence. In the example shown in the figure, this is block 0 (start: 1, end: 10, target: sentence, role: sentence). That is, the block in this example covers a sentence in addition to a clause, a phrase and so on. The concept of the block thus also includes a paragraph and the whole sentence, each of which can be regarded as a block. Furthermore, "transitive verb root (with 's attached)" is written as the part of speech of "let's" in the word information, and its let information is "1".

As shown in Fig. 81, processing 9300 for preparing a block for the sentence is performed before the collective grouping of blocks is started for the input English sentence. The subsequent processing operations are the same as in the flowcharts shown in Fig. 66. For example, for the English sentence I said, "let's go to school.", block 0 is formed (start: beginning of the sentence, end: end of the sentence, role: sentence, target: sentence).

As shown in Fig. 81, in the syntactic analysis section I 9020 each structural constituent is recognized as a block on the basis of the English data supplied to it, and its target and role are evaluated accordingly (9120). If there is no block in the arrangement (9121), the analysis ends. If blocks are present in the input sentence, the innermost block is analyzed first (9122). Although the analysis generally yields several solutions, only the solution that is collectively arranged as a single CFG symbol is selected from among them (9123). The subsequent processing operations are the same as those in Fig. 23.

Such processing of the "let" information is carried out in section 9200 according to the processing flows shown in Figs. 83A and 83B. First, a pointer is set to the first block (9330) in order to check the word located at the head of the block (9331). If the let information is "0", the pointer is advanced by one step (9339) to move on to the next word.

If the let information is not "0", the preceding dictionary reference unit is checked (9322). If it is not a punctuation mark and the pointer does not indicate the head, the pointer is advanced by one step (9339) to move on to the next word. If the check shows that the preceding dictionary reference unit is a punctuation mark, or if the pointer indicates the head, the innermost block containing the word is marked (9333).

If the let information is then "1" (9334), the word immediately after the punctuation is "let's" or "let us", so the role of the marked block is recognized as "invitation sentence" (9336). If the information is "2", the word is "please", so the role of the marked block is recognized as "request sentence" (9335). The target of the marked block is then recognized as an imperative sentence (9337), and the word information indicated by the pointer is deleted (9338). The pointer is then advanced by one step (9339) to move on to the next word. The procedure is carried out up to the word at the end position (9340).
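The flow of steps 9330 to 9340 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Word` and `Block` classes, the `PUNCTUATION` set and the function names are assumptions introduced only for illustration, and word positions are 0-based list indices rather than the numbering used in the figures.

```python
from dataclasses import dataclass

# Let-information codes as described for Fig. 78.
LET_NONE, LET_INVITE, LET_PLEASE = 0, 1, 2

# Assumed set of tokens that count as punctuation (not specified in the text).
PUNCTUATION = {",", ".", "!", "?", ";", ":", '"'}


@dataclass
class Word:
    text: str
    let_info: int = LET_NONE


@dataclass
class Block:
    start: int                # index of the first word (inclusive)
    end: int                  # index of the last word (inclusive)
    target: str = "sentence"
    role: str = "sentence"


def innermost_block(blocks, index):
    """Return the smallest block that contains the word at `index`."""
    containing = [b for b in blocks if b.start <= index <= b.end]
    return min(containing, key=lambda b: b.end - b.start)


def process_let(words, blocks):
    """Flag invitation/request blocks and drop the triggering word information.

    Returns the word information that is kept; block indices still refer
    to positions in the original `words` list.
    """
    kept = []
    for i, word in enumerate(words):
        if word.let_info == LET_NONE:
            kept.append(word)
            continue
        at_head = any(b.start == i for b in blocks)
        after_punct = i > 0 and words[i - 1].text in PUNCTUATION
        if not (at_head or after_punct):
            kept.append(word)          # e.g. causative "let" inside a clause
            continue
        block = innermost_block(blocks, i)
        block.role = "invitation" if word.let_info == LET_INVITE else "request"
        block.target = "imperative"
        # The word information itself is deleted, i.e. not appended to `kept`.
    return kept
```

In this sketch the flag is stored directly in the block's `role` and `target` fields, so the later syntactic analysis only sees "go to school" plus the block-level information.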

Fig. 80 shows an example of the analysis result obtained by applying this let-information processing to the input sentence mentioned above: I said, "let's go to school." After section 9200 has added the let information to the word information, it removes from the table the information to which the let information relates, and the block information is written with "imperative sentence" as the target and "invitation sentence" as the role, as shown in Fig. 80.

When hyphenated words in an English sentence are handled by looking up the hyphen-connected words as a whole in the dictionary 9018, the processing succeeds if an entry for them exists in the dictionary 9018. If a hyphenated word that is not registered in the dictionary 9018 is treated in its entirety as an unknown word, for example as an adjective, it cannot be translated, because the dictionary information for the individual hyphen-connected words cannot be used. Yet where entries exist in the dictionary 9018 for the individual components of the hyphenated word, this information should not be ignored.

Moreover, if the analysis is performed by decomposing hyphenated words into their individual components, the ways in which these components can combine become extremely varied.

To resolve this difficulty, the hyphenated word as a whole is analyzed as an adjective in the sentence, a separate analysis is carried out only for the interior of the hyphenated word using its components, and the results of these analyses are then combined. This makes it possible to analyze hyphenated words using the information for each component. That is, a hyphenated word that is not registered in the dictionary 9018 is treated as a whole in the same way as an adjective. The nouns connected by the hyphen are looked up in the dictionary, and the corresponding analysis is carried out in closed form only for the interior of the hyphenated word.

That is, if the hyphenated word is not registered in the dictionary 9018, block information is output stating that the whole is regarded as one block, and the dictionary is consulted for each component inside the block in order to write out the respective unit information, which does not include the hyphen. For a word not registered in the dictionary, ending evaluation processing is carried out together with unknown-word processing, as one kind of processing for unknown words.

Such processing of hyphenated words can be carried out in the example of the structure shown in Fig. 64. In this case, the position of a word in the sentence is expressed not by the number assigned to the word but by the number of characters from the beginning of the sentence, i.e. by the character count.

Fig. 84 shows an example of the hyphenated-word processing carried out in the morphological analysis section 9016. For an input English sentence, for example "The anti-war attitude is her open-door policy", the position pointer is advanced step by step to take out one word (9135), and a dictionary lookup is then performed (9353). Here the hyphen is not used as a word delimiter. If the entry exists (9353), the word information is written out (9359).

This is then repeated until the end of the sentence.

If the dictionary lookup 9352 finds no entry, the word information is written out (9359) when the word contains no hyphen (9354); when the word does contain a hyphen, a hyphen block is written (9355). In the hyphen block, the start position is the start position of the hyphenated word and the end position is the end position of the hyphenated word. The target is optional, and the role is adjective/noun. The hyphen is then removed to take out each of the component words (9356), and each component word is retrieved from memory (9357). The word information obtained from the dictionary lookup is written out (9358). When the word information is written out in steps 9359 and 9358, a word not registered in the dictionary is written out with part of speech = unregistered word.
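The hyphen handling described above can be sketched in Python as follows. This is only an illustration under assumptions: the toy `DICTIONARY`, its part-of-speech labels, the tokenizing regular expression and the function name `analyze_hyphens` are hypothetical and do not come from the patent. Positions are 1-based character counts from the beginning of the sentence, as the description specifies.

```python
import re

# Hypothetical toy dictionary: entry -> part of speech.
DICTIONARY = {
    "the": "article", "anti-war": "adjective", "attitude": "noun",
    "is": "verb", "her": "pronoun", "open": "adjective",
    "door": "noun", "policy": "noun",
}
UNREGISTERED = "unregistered word"


def analyze_hyphens(sentence, dictionary=DICTIONARY):
    """Return (word_info, blocks) for one sentence.

    word_info is a list of (word, start_position, part_of_speech);
    blocks is a list of hyphen blocks with 1-based character positions.
    """
    word_info, blocks = [], []
    # The hyphen is not a delimiter, so "open-door" is taken out as one word.
    for match in re.finditer(r"[A-Za-z]+(?:-[A-Za-z]+)*", sentence):
        word = match.group()
        start, end = match.start() + 1, match.end()
        pos = dictionary.get(word.lower())
        if pos is not None or "-" not in word:
            # Registered word, or unregistered word without a hyphen.
            word_info.append((word, start, pos or UNREGISTERED))
            continue
        # Unregistered hyphenated word: write a hyphen block ...
        blocks.append({"start": start, "end": end,
                       "target": "optional", "role": "adjective/noun"})
        # ... then remove the hyphen and look up each component.
        offset = start
        for part in word.split("-"):
            part_pos = dictionary.get(part.lower(), UNREGISTERED)
            word_info.append((part, offset, part_pos))
            offset += len(part) + 1   # skip the component and the hyphen
    return word_info, blocks
```

Applied to "The anti-war attitude is her open-door policy" with this toy dictionary, the sketch keeps "anti-war" as a single entry and produces one hyphen block with start 30 and end 38 for "open-door".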

Fig. 85 shows examples of the block information and word information obtained by processing the example input sentence and collectively grouping it into blocks. In this example, the hyphenated word "anti-war" is registered in the dictionary 9018 and the hyphenated word "open-door" is not. Consequently, the entry for the hyphenated word "anti-war" is written out as the word information. The hyphenated word "open-door" is decomposed into "open" and "door", which are written out as word information, and block 1 (start: 30, end: 38, target: optional, role: adjective/noun) is written out as block information.

Although the form of the English tag question is extremely limited, processing it with the usual analysis method is very complicated. It is also not easy to determine the verb to which a tag question refers. Once a tag question has been recognized from the formal features of the sentence, it is treated as a tag question relative to the structural constituent to which it belongs, so that the verb concerned by the tag question can be specified. That is, the tag-question part of the English sentence is detected as a structural pattern, and the analysis is carried out with the tag-question part regarded purely as information attached to a particular kind of structural constituent.

In the present embodiment, a unit or a block is described by means of symbols (a start-point marker, an indicator that this is a unit or a block, and an end point). In the morphological analysis, the input sentence text is shaped, and the blocks are recognized at the same time. In the present embodiment, a quotation mark is denoted by "Q" and a bracket by "P". For example, the following are represented, respectively:

′. . . .′  by (Q′. . . .)′,
". . . ."  by (Q". . . .)",
(. . . .)  by ((P. . . .)),
<. . . .>  by <(P. . . .)>,
{. . . .}  by {(P. . . .)} and
[. . . .]  by [(P. . . .)].

Block recognition is carried out in the same way.

The start and end symbols of a block apply only in contexts where these symbols actually open or close a block. The character immediately before the start symbol and the character immediately after the end symbol should be something other than an alphanumeric character. Symbols of the above kinds that do not meet this condition are treated as plain symbols. Blocks may be nested several times, provided that they do not cross or overlap.
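The condition that blocks may nest but must not cross can be checked mechanically. The following Python sketch is one possible way to do so, assuming blocks are given as inclusive `(start, end)` position pairs; the function name and the representation are assumptions, not part of the description.

```python
def properly_nested(blocks):
    """Return True if no two (start, end) spans partially overlap.

    Spans are inclusive. Nested spans and disjoint spans are allowed;
    crossing spans such as (1, 5) and (3, 8) are rejected.
    """
    # Visit outer blocks before the blocks they contain.
    spans = sorted(blocks, key=lambda b: (b[0], -b[1]))
    open_ends = []                      # end positions of currently open blocks
    for start, end in spans:
        # Close every block that ended before this one starts.
        while open_ends and open_ends[-1] < start:
            open_ends.pop()
        # The new block must end inside the innermost open block.
        if open_ends and end > open_ends[-1]:
            return False
        open_ends.append(end)
    return True
```

A block set that passes this check can be processed innermost block first without ambiguity about which block a word belongs to.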

In tag-question processing, if one of the following word sequences follows at the moment the pointer indicates ",", the cluster from after the "," up to the "?" is deleted as a unit and a flag is set on the block. That is, the tag question takes one of the following forms:

", (auxiliary verb) + (personal pronoun)?"
", (auxiliary verb) n't + (personal pronoun)?"
", (auxiliary verb) + (personal pronoun) + not?"

The auxiliary verbs include the following: am, is, are, was, were, do, does, did, have, has, had, will, shall, would, should, can, cannot, could, may, might, must, ought, won't, shan't, need, dare, used. The personal pronouns include I, you, he, she, it, we, they.
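The three tag-question forms and the word lists above lend themselves to a simple pattern match. The Python sketch below uses a regular expression as a stand-in for the pointer-based procedure of the description; the function name `split_tag_question`, the returned dictionary and the treatment of "cannot", "won't" and "shan't" as negative are assumptions for illustration. Irregular contractions such as "can't" do not fit the literal "(auxiliary verb) n't" pattern and are not covered here.

```python
import re

AUXILIARIES = [
    "am", "is", "are", "was", "were", "do", "does", "did", "have", "has",
    "had", "will", "shall", "would", "should", "can", "cannot", "could",
    "may", "might", "must", "ought", "won't", "shan't", "need", "dare", "used",
]
PRONOUNS = ["I", "you", "he", "she", "it", "we", "they"]


def _alternation(words):
    # Longest alternatives first so that "cannot" is tried before "can".
    return "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))


TAG = re.compile(
    rf",\s*(?P<aux>{_alternation(AUXILIARIES)})(?P<neg>n't)?"
    rf"\s+(?P<pron>{_alternation(PRONOUNS)})(?P<not>\s+not)?\s*\?",
    re.IGNORECASE,
)


def split_tag_question(sentence):
    """Return (sentence_without_tag, tag_info), or (sentence, None) if no tag is found."""
    m = TAG.search(sentence)
    if m is None:
        return sentence, None
    aux = m.group("aux").lower()
    negative = bool(m.group("neg") or m.group("not")) or aux in {"cannot", "won't", "shan't"}
    tag_info = {"aux": aux, "pronoun": m.group("pron").lower(), "negative": negative}
    # Delete the cluster after "," up to "?"; the comma stays with the main clause.
    return sentence[: m.start() + 1] + sentence[m.end():], tag_info
```

The returned `tag_info` plays the role of the flag set on the block: the remaining text can be analyzed as an ordinary clause while the tag question is carried along as block-level information.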

These are used as information of the innermost block to which they belong. For example, in the English sentence: you said so, didn't you?, the whole is treated as one block in terms of structural composition, as [You said so,] <with tag question>. Similarly, in the English sentence: I said, "You said so didn't you?", the quoted sentence "You said so didn't you?" is recognized as block 1 in terms of structural composition, and the whole is further recognized as block 2 in terms of structural arrangement. That is, [I said, [You said so,] <with tag question>].

A contracted word such as "didn't" is processed after it has been expanded into its fully written-out form according to a predetermined table. For a word that has several expanded forms, all of them are output.
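A table-driven expansion of this kind can be sketched in a few lines of Python. The table contents and the function name `expand` are hypothetical; the description only states that a predetermined table exists and that all expanded forms are output for ambiguous words.

```python
# Hypothetical excerpt of the expansion table: contraction -> all full forms.
EXPANSIONS = {
    "didn't": ["did not"],
    "isn't": ["is not"],
    "won't": ["will not"],
    "he's": ["he is", "he has"],    # ambiguous: both forms are output
    "i'd": ["I would", "I had"],    # ambiguous: both forms are output
}


def expand(word):
    """Return every written-out form of `word`; unknown words are returned unchanged."""
    return EXPANSIONS.get(word.lower(), [word])
```

Returning a list even for unambiguous words keeps the calling code uniform, since later stages can simply analyze each candidate form in turn.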

To carry out such tag-question processing, a corresponding section 9210 is arranged between the morphological analysis section 9016 and the syntactic analysis section I 9020 in another modification of the invention. Fig. 87 shows the relevant sections. In this figure, elements identical to those in Fig. 64 are denoted by the same reference numerals.

Section 9210 receives the result of the morphological analysis together with the input English sentence from section 9016, and, as shown in Fig. 88, a block is set up for the sentence. In the example shown in the figure, this is block 0 (start: 1, end: 12, target: sentence, role: role). In this modified embodiment, the position of a word is expressed by the number of the word. In this modified embodiment the block covers, for example, a sentence in addition to a clause and a phrase. The concept of the block thus also includes a paragraph and the whole sentence, each of which can be regarded as a block.

The blocks for an input English sentence containing a tag question can be assembled in the same way as in the flowchart described above and shown in Fig. 81. That is, processing 9300 for preparing the block of the sentence is carried out before the processing begins. For example, for the English sentence I said, "It is good, isn't it?", block 0 is formed (start: beginning of the sentence, end: end of the sentence, role: sentence, target: sentence). In the syntactic analysis section I 9020, the analysis follows the same flowchart as shown in Fig. 82. The processing in the tag-question processing section 9210 is explained with reference to Figs. 90A and 90B. First, a pointer is set to the first word of the word information (9370). If that word is not a comma, the pointer is advanced by one step (9384). This is repeated until the end of the sentence (9371). When a comma is found, the pointer is left where it is and a check is made whether the word following the comma belongs to the α group or to the β group (9373, 9379). A word belongs to the α group if its part of speech is an auxiliary verb or the verb "be" and it is not in negative form; a word belongs to the β group if it is the negative form of an auxiliary verb or the negative form of the verb "be". If the word belongs to neither group, the pointer is advanced by one step (9484) and the operations are repeated until the end of the sentence (9371).

If the word belongs to the α group and the word following it is not a pronoun, step 9384 is carried out, i.e. the pointer is advanced. If it is a pronoun, a check is made whether the next word is "not" (9375); if it is not "not", a check is made whether the word following the pronoun is a question mark (9377). If it is not a question mark, step 9384 is carried out. If it is a question mark, the target of the innermost layer block is rewritten to "negative sentence" and its role to "tag-question sentence" (9378), and the span ", . . . ?" is deleted from the word information table (9383). The innermost layer block is the block that satisfies both start position ≤ (position of ",") and end position ≥ (position of "?") and that has the smallest (end position − start position).
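The selection of the innermost layer block can be illustrated with a minimal Python sketch. The `Block` record and the function name `innermost_block` are hypothetical names chosen for illustration; only the two position conditions and the minimum-span rule come from the description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    # Hypothetical record mirroring the block information (start, end, target, role).
    number: int
    start: int
    end: int
    target: str = "optional"
    role: str = "optional"


def innermost_block(blocks: List[Block], comma_pos: int, qmark_pos: int) -> Optional[Block]:
    """Return the block with start <= comma_pos, end >= qmark_pos and minimal span."""
    candidates = [b for b in blocks if b.start <= comma_pos and b.end >= qmark_pos]
    if not candidates:
        return None
    # Among all enclosing blocks, the one with the smallest (end - start) is innermost.
    return min(candidates, key=lambda b: b.end - b.start)
```

For blocks (start: 1, end: 12) and (start: 4, end: 12) with the comma at word 8 and the question mark at word 11, both blocks enclose the tag question, and the second one is chosen because its span is smaller.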

If, at step 9375, the word following the pronoun is "not", a check is made whether the word following "not" is a question mark (9376). If it is not a question mark, step 9384 is executed, i.e. the pointer advances. If it is a question mark, the target of the innermost layer block is rewritten to "affirmative sentence" and the role to "tag-question sentence" (9382), and ", . . . ?" is deleted from the word information table (9383).

If, at step 9379, the word following the comma belongs to the β group and the word following the β-group word is not a pronoun, step 9384 is carried out. If it is a pronoun, a check is made whether the word following it is a question mark (9381), and step 9384 is entered if it is not. If it is a question mark, the target of the innermost layer block is rewritten to "affirmative sentence" and the role to "tag-question mark" (9382), and ", . . . ?" is deleted from the word information table (9383). The pointer is then advanced by one step (9384), and the operations are repeated until the end of the sentence.
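The flow of steps 9370 to 9384 can be sketched roughly as follows. This is an illustrative reading of the description, not the patented implementation: the `Word` and `Block` records, the part-of-speech tags ("aux", "be", "pron") and all function names are assumptions, a single role label is used for all three cases, and after a deletion the scan simply continues at the word that follows the removed span.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ROLE = "tag-question sentence"  # assumption: one role label for all cases


@dataclass
class Word:
    number: int            # word number in the word information table
    text: str
    pos: str = ""          # hypothetical tags: "aux", "be", "pron", ...
    negative: bool = False


@dataclass
class Block:
    number: int
    start: int
    end: int
    target: str = "optional"
    role: str = "optional"


def group_of(w: Word) -> Optional[str]:
    """alpha: non-negative auxiliary or 'be'; beta: their negative forms."""
    if w.pos in ("aux", "be"):
        return "beta" if w.negative else "alpha"
    return None


def match_tag(words: List[Word], i: int) -> Optional[Tuple[str, int]]:
    """words[i] is a comma; return (new target, index of '?') or None."""
    def at(k: int) -> Optional[Word]:
        return words[k] if k < len(words) else None

    verb, pron = at(i + 1), at(i + 2)
    if verb is None or pron is None or pron.pos != "pron":
        return None
    group = group_of(verb)
    nxt = at(i + 3)
    if group == "alpha":
        if nxt is not None and nxt.text == "not":              # 9375
            q = at(i + 4)
            if q is not None and q.text == "?":                # 9376
                return ("affirmative sentence", i + 4)         # 9382
            return None
        if nxt is not None and nxt.text == "?":                # 9377
            return ("negative sentence", i + 3)                # 9378
        return None
    if group == "beta" and nxt is not None and nxt.text == "?":  # 9381
        return ("affirmative sentence", i + 3)                 # 9382
    return None


def process_tag_questions(words: List[Word], blocks: List[Block]) -> None:
    i = 0                                                      # 9370
    while i < len(words):                                      # 9371
        hit = match_tag(words, i) if words[i].text == "," else None
        if hit is None:
            i += 1                                             # 9384
            continue
        target, q = hit
        comma_no, q_no = words[i].number, words[q].number
        enclosing = [b for b in blocks if b.start <= comma_no and b.end >= q_no]
        if enclosing:
            inner = min(enclosing, key=lambda b: b.end - b.start)
            inner.target, inner.role = target, ROLE
        del words[i:q + 1]                                     # 9383


# Example sentence: I said, "It is good, isn't it?"
tokens = [("I", "pron", False), ("said", "verb", False), (",", "", False),
          ('"', "", False), ("It", "pron", False), ("is", "be", False),
          ("good", "adj", False), (",", "", False), ("isn't", "be", True),
          ("it", "pron", False), ("?", "", False), ('"', "", False)]
words = [Word(n, t, p, neg) for n, (t, p, neg) in enumerate(tokens, start=1)]
blocks = [Block(0, 1, 12, "sentence", "sentence"), Block(1, 4, 12)]
process_tag_questions(words, blocks)
```

In this run the first comma (word 3) is skipped because the word after it is a quotation mark, which belongs to neither group. The second comma (word 8) is followed by a β-group word, a pronoun and a question mark, so block 1 becomes (target: affirmative sentence, role: tag-question sentence) and words 8 to 11 are removed.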

For example, Fig. 88 shows the block and word information that the tag-question processing section 9210 has received from the morphological analysis section 9016 for the English sentence: I said, "It is good, isn't it?". The block information for block 1 is (start: 4, end: 12, target: optional, role: optional). When this is subjected to the tag-question processing in section 9210, the block information for block 1 is rewritten to (start: 4, end: 12, target: affirmative sentence, role: tag-question sentence), and at the same time the word information relating to the tag question, #8 to #11, is deleted.

Claims

A speech analyzer having dictionary means containing morphological data of a particular language and additional data enabling multiple word combinations, having first analysis means for performing a morphological and syntactic analysis of an input character string (word sequence) of said language using said morphological data, and having second analysis means for performing a syntactic analysis of the input character string (word sequence) on the basis of the analysis result of the first analysis means, wherein

a) the first analysis means ( 1016 ) is adapted to define a syntactic block using said additional data and the morphological data and to size syntactically related morphemes using the additional data and the morphological data, and
b) the second analysis means ( 1020 ) is arranged to parse the string (word sequence) within the syntactic block and to treat the result of the structuring process or the associated morphemes as a single morpheme.

The speech analyzer according to claim 1, characterized in that said first analysis means ( 1016 ) is adapted, on detecting during the analysis a high degree of coupling for a number of combinations between a word retrieved in the dictionary means and other words stored in the dictionary means, to prepare preference data that give preference to a number of combinations for subsequent processing.

The speech analyzer according to claim 1, characterized in that said dictionary means comprises a basic unit storage section ( 2026 ) for storing dimension units, and that said first analysis means addresses said basic unit storage section ( 2026 ) for said input character string in order to analyze morphologically whether or not the string expresses a dimension unit.

The speech analyzer according to any one of claims 1 to 3, characterized in that the first analysis means ( 1016 ) has a pointer that can be set to a character at the front of the input string, and, when a portion of the string is retrieved on consulting the dictionary means for the string starting from the character at which the pointer has been set, the pointer can be set to the string following the retrieved portion, the dictionary means then being addressable for the subsequent string to which the pointer is set.

The speech analyzer according to any one of claims 1 to 4, characterized in that said dictionary means stores dictionary data for respective dictionary reference units, and said first analysis means ( 1016 ) divides the input sentence into dictionary reference units and analyzes the dictionary reference units morphologically by referring to the dictionary means, wherein the dictionary means stores as part of the dictionary data a discrimination notice indicating that a dictionary reference unit represents a number, and, when the discrimination notice is included in the retrieved dictionary data and another dictionary reference unit with a different discrimination notice is retrieved near the former dictionary reference unit, a numerical value is formed from the two dictionary reference units by the second analysis means ( 1020 ).

The speech analyzer according to any one of claims 1 to 5, characterized by feature information generating means (in 1020 ) for generating replacement feature information for a string not recorded in the dictionary means.

The speech analyzer according to claim 6, characterized in that said feature information generating means (in 1020 ) is adapted to, with two consecutive strings (word sequences), one string (word string) having no feature information, compare them with the feature information of the other string (Word sequence) having feature information.

A speech analyzer according to any one of claims 1 to 7, characterized in that, when the string starts with a capital letter and the preceding string (word sequence) is at the end of a sentence, the first analysis means ( 1016 ) converts the capital letter at the beginning of the string to a lower-case letter and then consults the dictionary means for that string by means of a fetch device, and, if the string is not registered in the dictionary means, the string is parsed as a non-registered proper name.

A speech analyzer according to claim 1, characterized in that the dictionary means comprises data for distinguishing dictionary reference units from a particular semantic element, the first analyzing means ( 1016 ) having a corresponding discrimination table for determining whether a sequence of dictionary reference units forming, with the specific semantic element, a composite unit expressing a specific meaning composed according to a certain rule, and the first analyzing means ( 1016 ) retrieving the dictionary means with respect to the respective dictionary reference units contained in the input sentence; and form the sequence of dictionary reference units with the specific semantic element into a single analysis unit, corresponding to the corresponding discrimination table.

The speech analyzer according to claim 1, characterized by control means ( 1038 ) for consulting the dictionary means ( 1018 ) and controlling the first and second analysis means ( 1016, 1020 ) to perform their respective analyses.

The speech analyzer according to claim 1 or 10, characterized in that said first analysis means ( 1016 ), when no dictionary data is found on consulting said dictionary means ( 1018 ) for a number of hyphenated words, consults said dictionary means ( 1018 ) for each of the number of words, evaluates the number of words as a whole as a compound, and then evaluates the compound as an adjective group.