NL8802271A

NL8802271A - Grammar processor for natural language sentences

Info

Publication number: NL8802271A
Application number: NL8802271A
Authority: NL
Original assignee: Oce Nederland Bv
Priority date: 1988-09-15
Filing date: 1988-09-15
Publication date: 1990-04-02

Abstract

The grammatical processing is carried out by steps applied separately to each word unit of the sentence: (a) for each word unit and each constituent, the functional word category within the constituent is determined from data on the verbal category of the word unit and the category of the constituent; (b) a step concerning closure of the constituent is determined from data on the category of the constituent and the category of the constituent dominating it; (c) the current constituent is tested against grammar-based rules on the coherence of its words and constituents and, where required, the probability factor of the sentence representation is re-evaluated.

Description

Method for parsing a sentence, expressed in natural language, into sentence parts to be described with functional indications, and a device for carrying out such a method

The invention relates to a method for parsing a sentence, expressed in natural language, into sentence parts to be described with functional indications, on the basis of word units that have been lexicalized according to verbal categories and application-oriented categories, and to a device for carrying out such a method.

Such a method, known in the field as a "parser", is described in: Allen, James; Natural Language Understanding. The Benjamin/Cummings Publishing Company, Inc., Menlo Park, U.S.A., 1987.

The parsers described there have a so-called "rewriting mechanism", which means that they parse a sentence by means of a large number of rewriting rules.

These rewriting rules relate a group of words and/or sentence constituents on the one hand to a parent constituent, that is, a constituent dominating this group, on the other. The number of rewriting rules depends on the extent of the descriptive mechanism underlying the parser; the descriptive mechanism is in turn determined by the syntax and morphology of the language, which limits the parser's ability to arrive at a parse when the input is ungrammatical. Only by including a very extensive range of rewriting rules in the parser is it possible to parse incorrectly formed as well as less common sentence constructions. As a consequence, the memory space reserved for the rewriting rules on the computer running the parser must be very large, while the time needed to parse such a sentence is long. Moreover, it is very difficult to detect ungrammatical expressions. The object of the invention is to overcome these drawbacks to a large extent.

The invention is based on the insight of binding the parsing to rules ("right associate grammar rules") for fitting a word into an existing or newly formed constituent in the sentence-part representation already obtained, while assigning a certain functionality at word level (functional word category) on the basis of at least one verbal category appropriate to that word, and to rules ("constituent closure grammar rules") for examining whether a constituent can be closed, while assigning a provisional or definitive functionality at constituent level (functional constituent category). During the parsing procedure, an expectation factor for the sentence is maintained on the basis of an examination of the coherence between the words and/or constituents within the same constituent level, and this factor is used to select the most plausible sentence representation.

According to the invention, the method of the kind described in the preamble is characterized by the following steps, to be performed for each word unit of the sentence:

a. determining, for each word unit and each constituent, the functional word category within the constituent and/or within a constituent to be newly created, on the basis of data on the verbal category of the word unit concerned and the category of the constituent;

b. describing, for each constituent, a measure concerning the closure of the constituent on the basis of data on the category of the constituent and the category of the constituent dominating it, and assigning a provisional or definitive functional label to a constituent that is then to be closed;

c. testing the constituent concerned, in at least one of the two preceding steps, against prescriptions based on syntax rules concerning the coherence of the words and/or constituents within this constituent, and, if necessary, re-evaluating an expectation factor assigned to the sentence representation, and selecting every sentence representation whose expectation factor lies above a certain threshold value.

The parser to be described in a first embodiment is a parser with an external grammar (syntax directed parser), in which the grammatical rules are kept outside the actual program part and are stored in memory sections or memory modules. This makes it possible to change the grammatical content without touching the program. Furthermore, the parser can be made suitable for another natural language merely by changing the contents of the memory sections or memory modules. The grammatical rules are formulated in such a way that a limited number of rules already suffices to process a large number of possible structures.

Moreover, given ungrammatical input this parser will generally still produce a solution and will even indicate what kind of error was made. This makes the parser suitable for use within an editor, for example.

Besides the above embodiment of a parser with an external grammar, a parser with an internal or integrated grammar (syntax embedded parser) is of course also possible, by placing the grammar rules inside the parser. This can result in a faster parser.

The invention will be explained in more detail with reference to the accompanying figures, of which:

Fig. 1 shows, by means of a flow diagram, a general overview of the method to be followed when parsing a text;

Fig. 2 shows a flow diagram of the basic principle of the parser;

Fig. 3 shows, in a flow diagram, a first embodiment of the core part of the flow diagram shown in Fig. 2;

Fig. 4 shows, in a flow diagram, another embodiment of the core part shown in Fig. 3;

Fig. 5 shows, in a flow diagram, yet another embodiment of the core part shown in Fig. 3;

Fig. 6 shows a detailed flow diagram of a first, common part of the core part shown in Figs. 3 to 5;

Fig. 7 shows a detailed flow diagram of a first part of the common part shown in Fig. 6;

Fig. 8 shows a detailed flow diagram of a second part of the common part shown in Fig. 6;

Fig. 9 shows a detailed embodiment of a second, common part of the core part shown in Figs. 3 to 5;

Fig. 10 shows a flow diagram of an application-oriented part of the flow diagram shown in Fig. 2;

Fig. 11 shows, in a flow diagram, a detailed representation of a part of the flow diagram shown in Fig. 10; and

Fig. 12 shows, in a flow diagram, a modification of a part of the flow diagram shown in Fig. 7.

A parser provides the means to convert a sentence, expressed in a natural language, into a syntactic representation by means of a formalism in which the relations between the various entities are expressed. That formalism is based on the idea that the self-contained parts (constituents) of a sentence can be represented by a hierarchical structure, such as a branching network (tree structure).

Strictly speaking, the input to the parser is not a sentence line but a series of words obtained by lexicalization of the sentence. As shown in Fig. 1, this means that every sentence presented at program phase 1 must first be lexicalized at program phase 2 before the parser parses it at program phase 3. In lexicalizing a sentence, the lexical information on the possible verbal categories and the application-oriented categories of each word is looked up and/or generated by means of a so-called lexical memory, which forms part of the computer memory or resides, for example, on a memory disk, and this information is stored by means of one or more word structures in a word memory, which likewise forms part of the computer memory. The results of lexicalizing a word unit with respect to the application-oriented categories will hereinafter be grouped under the term "Features". Why there may be more than one word structure is discussed later in the description; first, the notion of a word structure is explained in more detail.

A word structure is a set of data items belonging to a word, to which a structure with a number of memory fields is assigned. The word structure comprises the following fields: a Category field, a Command field, a String field and a Features field. The Category field contains the word category found in the lexical memory, for example: article, noun, pronoun, verb, preposition ("P-word"), conjunction ("C-word"), adjective or adverb. This list serves only as an example and depends on the dictionary and grammar tables used.

The Command field is initially empty and is filled in later by the parser. The meaning of the Command field is explained further in the description of another structure, namely the constituent structure, which also has such a Command field.

The String field is filled with a "string" representation of the word, i.e. a linear arrangement of the elements of the word. The Features field contains the remaining word features, i.e. those features obtained during the lexicalization process but not included in the Category field. The word features included in the Features field are of a functional nature.

The above is illustrated by two examples. For the word "huis" (house), the Category field is filled with "noun", the Command field with "NIL", the String field with "huis", and the Features field with "third person singular" (sing 3) and "neuter".

Such a representation of a word structure means that a word such as "regent" (in Dutch both the verb form "rains" and the noun "regent") obtains both a Category field filled with "verb" and a Category field filled with "noun", each Category field having its own Features field.
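The word structure described above can be sketched as a small record type. This is only an illustrative sketch: the class name `WordStructure`, the `LEXICON` table and the `lexicalize` helper are assumptions introduced here, and the features listed for "regent" are assumed for illustration (only those for "huis" are given in the text).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WordStructure:
    category: str                      # Category field, e.g. "noun" or "verb"
    string: str                        # String field: the word as a string
    features: List[str] = field(default_factory=list)  # Features field
    command: Optional[object] = None   # Command field: empty (NIL) until the parser fills it


# Hypothetical stand-in for the lexical memory: one entry per possible category.
LEXICON = {
    "huis": [("noun", ["sing3", "neuter"])],
    # Features for "regent" are assumed here for illustration only.
    "regent": [("verb", ["sing3"]), ("noun", ["sing3"])],
}


def lexicalize(word: str) -> List[WordStructure]:
    """Return one word structure per category found for the word."""
    return [
        WordStructure(category=cat, string=word, features=list(feats))
        for cat, feats in LEXICON.get(word, [])
    ]
```

Calling `lexicalize("regent")` yields two word structures, one per category, each with its own Features field, which is the ambiguity the parser then has to carry forward.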

Since a Word can thus occur in several word categories, and since ambiguous relations between Constituents make several sentence structures between these sentence parts conceivable, this approach leads to several tree structures and hence to several syntactic representations of the parsed sentence. The parser will always try to accommodate every word structure of the current word within every open constituent structure. As a result, the parsing speed decreases when many representations are present.

Since not all representations turn out to be suitable in the end, the aim of the invention is to determine how one or more less attractive representations of sentence-part structures can be eliminated in the interim, that is, while the parser is still at work. The starting point is that, after each phase in which a word has been added to the existing representation, the Constituent involved is subjected to certain syntactic rules; on the basis of certain error detections, the initial probability factor assigned to the Constituent is lowered according to a system of respective correction factors, and the corrected probability factor is then tested against a certain threshold value. All representations of Constituents with a probability factor smaller than that threshold value can then be eliminated as "less probable". This procedure is then repeated in order to incorporate the next word into the representation structure.
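The filtering idea can be sketched as follows. The text says only that the probability factor is lowered "according to a system of respective correction factors"; the multiplicative update, the violation names and the numeric values below are assumptions made for illustration, as are the names `Representation`, `penalize` and `prune`.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical correction factors per detected error type (values assumed).
CORRECTION = {"agreement": 0.5, "word-order": 0.3}


@dataclass
class Representation:
    label: str
    probability: float = 1.0                      # initial probability factor (assumed 1.0)
    violations: List[str] = field(default_factory=list)


def penalize(rep: Representation, violation: str) -> None:
    """Record a detected error and lower the probability factor (assumed multiplicative)."""
    rep.violations.append(violation)
    rep.probability *= CORRECTION[violation]


def prune(reps: List[Representation], threshold: float) -> List[Representation]:
    """Eliminate representations whose probability factor is below the threshold."""
    return [r for r in reps if r.probability >= threshold]
```

For example, a representation penalized for both assumed error types drops to 0.5 × 0.3 = 0.15 and is eliminated by `prune(reps, 0.2)`, while an unpenalized one survives.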

Such an action can take place directly after a word has been incorporated into an existing or new constituent, but it can also take place after a subsequent examination of whether that constituent can be closed. In the parser described here, such a filtering action takes place in both situations.

Although the lexicalization as described here takes place before the actual parsing, so that the parsing process begins only after all words have been lexicalized, it is also possible to lexicalize one word and then parse it, lexicalize the next word (in the meantime, if possible) and then parse it, and so on. An advantage of such an approach could be that parsing of a sentence can already begin while the user is still typing it (real time parsing).

The flow diagram described below, however, assumes a lexicalization that precedes the actual parsing process. After the lexicalization, the parser is called. If at step 4 it succeeds (Y) in finding an analysis, that analysis can be used in the rest of the program (step 5). This may be a tree-drawing program that draws a graphical representation of the parse found, an editor, an indexing and retrieval system, etc., after which processing continues with the next sentence at step 1. If step 4 does not lead to a usable result (N), the parser has not succeeded in finding a good analysis, which means that the sentence was highly ungrammatical. Some action may be taken for this case, after which processing continues with the next sentence.
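The control flow of Fig. 1 as described here can be sketched as a simple driver loop. The function name `process_text` and its callback parameters are hypothetical; the lexicalizer, the parser and the follow-up application are passed in as plain callables, and a failed parse is assumed to be signalled by `None`.

```python
from typing import Any, Callable, Iterable, List, Optional


def process_text(
    sentences: Iterable[str],
    lexicalize: Callable[[str], List[Any]],
    parse: Callable[[List[Any]], Optional[Any]],
    use_analysis: Callable[[Any], None],
    handle_failure: Callable[[str], None],
) -> None:
    """Driver loop following the steps of Fig. 1 (callback names are assumptions)."""
    for sentence in sentences:            # step 1: next sentence
        words = lexicalize(sentence)      # step 2: lexicalization
        analysis = parse(words)           # step 3: parsing
        if analysis is not None:          # step 4: analysis found (Y)
            use_analysis(analysis)        # step 5: e.g. tree drawing, editor, retrieval
        else:                             # step 4: no usable result (N)
            handle_failure(sentence)      # optional action for a highly ungrammatical sentence
```

Keeping the application step as a callback mirrors the text's point that the analysis may feed different programs, such as a tree-drawing program or an editor.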

Before the actual description of the parser (see step 3) begins, it is important to describe one further structure, namely the Constituent structure. The Constituent structure is a set of data items relating to a sentence part, comprising the following memory fields: Category field, Command field, Members field, Features field, Stack field, Violations field and Probability field.

The Category field is reserved for information on the category of the constituent. In the example grammar, the categories include the following sentence parts: main, coordinate or subordinate clause (Sentence, abbreviated S), the noun phrase (abbreviated NP), the prepositional phrase (abbreviated PP) and the adjective/adverbial phrase (abbreviated AP). The Category field initially contains the designation NIL, but is filled in with the information on the constituent during the parsing process. The Command field is the field that, after parsing, contains address information (a pointer) on the Parent constituent, i.e. the Constituent that dominates the constituent concerned.

The Members field is reserved for information on one or more pairwise combinations of the functional label of a Word or Constituent structure belonging to this constituent and the address information of that Word or Constituent structure. In general, a Constituent can be built up from a number of Word structures and/or Constituent structures.

The functional labels used in the example grammar include the following:

- Subj : subject
- Indobj : indirect object
- Obj : direct object
- Smod : sentence modification, also known as an "adverbial modification"
- Comp : complement
- Pred : predicate
- fNP : functional NP, used as a provisional label that is later replaced by a definitive functional label at sentence level
- Nmod-a : a modifier of a noun in the form of an AP
- Nmod-s : a modifier of a noun in the form of an S
- Nmod-p : a modifier of a noun in the form of a PP
- Det : a determining pronoun or article (determiner)
- Head : central element within an NP, PP or AP
- Pobj : a part following a preposition, realized by an NP
- Amod : a modification of the head word of an AP
- Seq : a connecting word with a coordinating function between two sentence parts or sentences (sequencer)
- Conj : one of the coordinated sentence parts.

The Characteristics field is reserved for information about the characteristics belonging to the Constituent. As soon as a Constituent is created with a certain Category, the Characteristics field is filled with the characteristics belonging to that Category. In the case of a Word category, the Characteristics field is given the characteristics associated with that Word, as found during the lexicalization process. During parsing, for each element (Word or Constituent) that is to be counted as part of the Constituent, the Characteristics field of that Constituent is adapted to the information associated with that element. The Stack field contains a list of the functional labels that are also listed in the Structural Elements field of the Constituent. The Error Message field contains information about all error codes relating to grammatical violations made within the Constituent. This field is initially empty.

The Probability field indicates the probability factor assigned to the occurrence of this Constituent. Initially this is "1", but for each grammatical violation detected the value decreases by an amount that depends on the type of violation detected.

Using the Constituents, a parse tree can be built up, since every Constituent can in turn contain Constituents and/or Words. In addition to the functional category, the Structural Elements field gives the memory address information of the associated Constituent or Word structures. With this information it is possible to descend the parse tree. Conversely, the Command field of a Constituent gives address information for its Parent constituent, i.e. the Constituent that dominates it, which makes it possible to move up the parse tree. Note that Words cannot contain structural elements, and that the Command field of the highest Constituent (root) in the parse tree contains no information (NIL).
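The Constituent record and the two directions of tree traversal can be sketched in Python as follows. This is only an illustration: the class and attribute names, and the helper functions root_of and leaves, are assumptions chosen for readability, with object references standing in for the address information described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union


@dataclass(eq=False)
class Word:
    text: str
    category: str                                   # word category from the lexicon
    features: List[str] = field(default_factory=list)


@dataclass(eq=False)
class Constituent:
    category: Optional[str] = None                  # Category field (NIL until filled in)
    command: Optional["Constituent"] = None         # Command field: pointer to the parent
    # Structural Elements field: (functional label, element) pairs
    elements: List[Tuple[str, Union[Word, "Constituent"]]] = field(default_factory=list)
    features: List[str] = field(default_factory=list)   # Characteristics field
    stack: List[str] = field(default_factory=list)      # Stack field
    errors: List[str] = field(default_factory=list)     # Error Message field
    probability: float = 1.0                             # Probability field

    def add(self, label: str, element: Union[Word, "Constituent"]) -> None:
        """Attach an element under a functional label and record the label."""
        self.elements.append((label, element))
        self.stack.append(label)
        if isinstance(element, Constituent):
            element.command = self                  # child now points at its parent


def root_of(c: Constituent) -> Constituent:
    """Climb the Command pointers until the root, whose Command field is NIL."""
    while c.command is not None:
        c = c.command
    return c


def leaves(c: Constituent) -> List[str]:
    """Descend through the Structural Elements field and collect the words."""
    out: List[str] = []
    for _, element in c.elements:
        if isinstance(element, Word):
            out.append(element.text)
        else:
            out.extend(leaves(element))
    return out
```

For example, an NP holding a Det and a Head can be attached to an S under the provisional label fNP; root_of then leads from the NP back to the S, and leaves(S) lists the words in order.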

As the Probability field already suggests, the parser works with probability factors. It has a threshold value, which can be modified if necessary, against which the probability factor of every available structure is tested; only structures whose probability factor is greater than or equal to the threshold value are processed by the parser. This makes it possible to parse grammatical and near-grammatical sentences more quickly.
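A minimal Python sketch of the probability bookkeeping and the threshold test is given below. The violation types and their penalty values are purely hypothetical; the text states only that the decrease depends on the type of violation. The function names are likewise assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical penalty per violation type (not taken from the source).
PENALTY: Dict[str, float] = {"agreement": 0.25, "word-order": 0.5}


def penalize(probability: float, violation: str) -> float:
    """Lower a probability factor by the penalty for one detected violation."""
    return max(0.0, probability - PENALTY[violation])


def filter_by_threshold(
    structures: List[Tuple[str, float]], threshold: float
) -> List[Tuple[str, float]]:
    """Keep only structures whose probability factor is >= the threshold."""
    return [(name, p) for name, p in structures if p >= threshold]
```

With a threshold of 1, only structures without any violation survive; lowering the threshold admits near-grammatical structures as well.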

The operation of the parser is explained in more detail in the following section. It is important to mention here that the parser rests on the insight that a Word of a certain Category within a Constituent of a certain Category will, in most common cases, have an unambiguously determined functional meaning within that Constituent. The same holds for Constituents within Constituents. The method of parsing is therefore based on the fact that it is usually sufficient to know the Category of the Parent constituent on the one hand and that of the Word or Constituent on the other in order to determine the function (functional category) of that Word or Constituent within the Parent constituent.
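This insight amounts to a lookup keyed on the pair (category of the parent, category of the element). The Python sketch below illustrates the idea; the table entries are assumptions derived from the label definitions in this description and are not the patent's actual tables, and the word category names ("article", "noun", "preposition") are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical entries: (parent category, element category) -> functional label.
FUNCTION_TABLE: Dict[Tuple[str, str], str] = {
    ("NP", "article"): "Det",
    ("NP", "noun"): "Head",
    ("NP", "AP"): "Nmod-a",
    ("NP", "PP"): "Nmod-p",
    ("NP", "S"): "Nmod-s",
    ("PP", "preposition"): "Head",
    ("PP", "NP"): "Pobj",
    ("S", "NP"): "fNP",      # provisional label, resolved later at sentence level
}


def functional_label(parent_category: str, element_category: str) -> Optional[str]:
    """Return the functional label for an element inside a parent, or None
    when the pair has no entry (the element cannot be fitted in directly)."""
    return FUNCTION_TABLE.get((parent_category, element_category))
```

A missing entry corresponds to the case in which a word cannot be placed directly in the current constituent and another constituent may have to be created.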

A program belonging to the parser is shown in the following figures by means of a flow diagram, according to which the method of parsing a text is carried out in a computer.

The parser as a whole will first be described in fairly general terms. Two program sections, namely "fitting a word in on the right of an existing or new Constituent" ("word right associate") and "closing a Constituent" ("closure"), are only outlined at this stage; a subsequent part of the description explains them in more detail with reference to detailed figures.

The method to be described can be divided into two phases. In the first phase, which comprises the actual parsing process, the possible sentence representations are formed by means of a syntactic analysis. In the second (additional) phase, definitive functional labels are assigned to constituents whose functional character is still unclear. The filter procedures appropriate to each phase will be discussed.

This method is carried out per word (with all its usage functions) and is carried out again each time a next word has to be included in the representation structure.

In the flow diagram of Fig. 1, the read-in phase of a sentence, which allows the parser's method to be carried out, starts at reference numeral 1. In this phase the sentence to be parsed is presented to the computer in a known manner (for example by means of a keyboard) and is written word by word into a memory section of the computer designated as the word memory, while the number of words (kmax) making up the sentence is also calculated. As a guideline, words are separated from one another by spaces, tabs and punctuation marks, and each punctuation mark is itself assigned to a word category. The number kmax is also kept in the word memory.
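The read-in phase can be approximated in Python with a regular expression, as sketched below. The function name read_sentence is an assumption, and the pattern is a simplification: it treats every single non-alphanumeric, non-whitespace character as a separate word, so an apostrophe inside a word would also split it.

```python
import re
from typing import List, Tuple


def read_sentence(sentence: str) -> Tuple[List[str], int]:
    """Split a sentence into words and punctuation marks and count them.

    Runs of word characters form one word; each punctuation mark counts
    as a word of its own; spaces and tabs only separate words.
    """
    words = re.findall(r"\w+|[^\w\s]", sentence)
    kmax = len(words)
    return words, kmax
```

For the input "The dog barks, loudly." this yields six words, with the comma and the full stop each counted as a word.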

The next phase (2) is the lexical description of all words in the sentence presented. For each word of the sentence, the available word structures are looked up in a memory section designated as the lexical memory, which is present in the computer memory and in which a large collection of possible word structures, one or more per word, is stored.

Phase 2 is explained in detail with reference to Fig. 2. There, the lexical description takes place as follows. At program step 6, a counting unit (the k-counter), which indicates the rank number of the word being processed (k), is reset to 0.

Next, at step 7, the count of the k-counter is increased by 1 (k = k+1), after which step 8 checks whether the count of the k-counter has already exceeded the maximum k-value (k > kmax). If so (Y), the program proceeds to step 11. If not (N), the program goes to step 9, where the next word is fetched from the word memory using an address determined by the count of the k-counter and written to a memory section of the computer designated as the word memory. Then, at step 10, the word written in the word memory is looked up in the lexical memory, and the specific information found there is added to the word in the word memory.

The program then returns to step 7. Since parsing requires every word to be examined for all of its listed word functions (word categories), each Word structure of a word that differs in word category and/or other characteristics is given its own rank number (l), and the number of rank numbers assigned (lmax,k) is kept for each word (k). Instead of keeping (lmax,k), it is also sufficient to attach a specific label to the last array, indicating to the subsequent syntactic analysis that the last word function of that word has been reached.
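The loop of steps 6 to 10 can be sketched in Python as shown below. The lexicon contents, the dictionary layout of the word memory and the function name lexicalize are all hypothetical; only the counter logic follows the flow diagram as described.

```python
from typing import Dict, List, Tuple

# Hypothetical lexical memory: word -> list of (word category, characteristics).
LEXICON: Dict[str, List[Tuple[str, List[str]]]] = {
    "the": [("article", [])],
    "walk": [("noun", ["singular"]), ("verb", ["plural"])],
    ".": [("punctuation", [])],
}


def lexicalize(words: List[str]) -> List[dict]:
    """Attach every Word structure found in the lexicon to each word."""
    word_memory: List[dict] = []
    kmax = len(words)
    k = 0                                   # step 6: reset the k-counter
    while True:
        k += 1                              # step 7: k = k + 1
        if k > kmax:                        # step 8: all words handled?
            break
        word = words[k - 1]                 # step 9: fetch the k-th word
        found = LEXICON.get(word.lower(), [])           # step 10: lexicon lookup
        structures = [
            {"l": l, "category": cat, "features": feats}
            for l, (cat, feats) in enumerate(found, start=1)
        ]
        word_memory.append(
            {"k": k, "word": word, "structures": structures, "lmax": len(structures)}
        )
    return word_memory
```

An ambiguous word such as "walk" ends up with two Word structures (l = 1 and l = 2) and lmax = 2, so the syntactic analysis can try each in turn.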

During the syntactic analysis carried out in phase 3, several representations with different constituent structures can occur simultaneously as possible solutions, each with its own probability factor depending on conflicts with certain grammar rules. Since only the solution or series of solutions with the highest probability factor is of interest, solutions with a lower probability factor must be classed as uninteresting and consequently eliminated. This is done by testing the probability factors of those solutions against a certain threshold value. This threshold value is set to "1" at the start of the syntactic analysis, but can be adjusted to a lower value during the analysis if none of the probability factors obtained turns out to meet the threshold. Accordingly, the program includes a step 11, at which the initial threshold value is set to "1".
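One way to picture the threshold adjustment is a retry loop, sketched below in Python. The schedule of lower thresholds and the idea of rerunning the whole analysis through a parse callback are assumptions made for illustration; the text states only that the threshold starts at "1" and can be lowered when no solution meets it.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Solution = Tuple[str, float]    # (representation, probability factor)


def parse_with_relaxation(
    parse: Callable[[float], List[Solution]],
    thresholds: Sequence[float] = (1.0, 0.75, 0.5, 0.25),   # hypothetical schedule
) -> Tuple[Optional[float], List[Solution]]:
    """Run the analysis at threshold 1 first and lower it until something survives."""
    for threshold in thresholds:
        solutions = parse(threshold)
        if solutions:
            return threshold, solutions
    return None, []             # nothing met even the lowest threshold
```

Here parse stands for the complete syntactic analysis at a given threshold; a fully grammatical sentence returns at the first attempt, which is where the speed advantage mentioned earlier comes from.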

Then, at step 12, an initial constituent is created for the sentence. This is a Constituent whose fields initially contain the following information:

Category field : S

Command field : NIL

Structural Elements field : NIL

Characteristics field : NIL

Stack field : NIL

Error Message field : NIL

Probability field : 1

To fill in the Characteristics field, a memory section designated as the first tabular memory (see Table A) is consulted for the Characteristics listed under category S. Since this example grammar lists no Characteristics there, the field remains empty (NIL). The memory further comprises a so-called Representation Memory for holding the current Constituents (for example in list form, as is usual in the programming language LISP), with which the parser always works. The initial constituent S is written into this Representation Memory, which was empty until now.

Then, at step 13, the k-counter is reset to the initial value "0", and at step 14 the count of the k-counter is increased by "1". After step 14 the program reaches step 15, which checks whether all words (including punctuation marks) in the sentence have already been involved in the syntactic analysis, i.e. whether k > kmax holds for the count of the k-counter. If so (Y), every Word has evidently been placed in the Constituents and the syntactic analysis is to be regarded as complete.

If the statement k > kmax is not true at step 15 (N), the program proceeds to step 16. There the fitting of the word designated by the k-counter into the Constituent structures available at that moment begins, namely into the not yet closed Constituent structure held in the Representation Memory mentioned above.

All Word structures (l) of this word are now available as a result of the lexicalization process. When a Word with a certain word structure is fitted into a Constituent with a certain Category, the Word will in many cases fulfil a particular function, so that a certain functional Word category can be assigned to it. At a later stage the same applies to Constituents within their Parent constituent. The function (functional category) of the Word within the Constituent is determined from the Constituent category on the one hand and the Word category on the other. If the Word cannot be fitted directly into the Constituent in this way, it is sometimes possible to attach a new Constituent to it in which the Word can be included. Note also that including a Word structure in an open Constituent structure can multiply the number of Constituent structures, owing to a number of grammar rules and to the number of Word structures (lmax,k) of that Word (k).

Since the results of the parsing process are always placed in the Representation Memory, and the current Constituent structures held there must serve as input to the parsing process, step 17 first clears a memory section of the computer designated as the "Temporary Memory", step 18 then copies the Representation Memory with all its Constituent structures to the Temporary Memory, and step 19 clears the Representation Memory. Several variants of the following program section 20 are possible. These are explained in turn with reference to Figures 3, 4 and 5.

Assuming that for a Word (k) with several Word structures (l) the possibility of attachment must be checked for each structure, the computer comprises a counting unit (the l-counter), whose count always corresponds to the Word structure concerned.

For a next word (k+1) that is to be placed in the existing constituent structure (Sm), the l-counter must again be reset to 0. The same applies to the counting unit (the m-counter) that represents the rank numbers (m) of the constituents.

Accordingly, at step 21 in Fig. 3 the count of the l-counter is reset to 0 (l = 0), after which at step 22 the count of the l-counter is increased by "1" (l = l+1).

Step 23 checks whether the count of the l-counter has already exceeded the maximum value (lmax,k), which represents the number of Word structures of the word with rank number k (l > lmax).

If the answer to this question is affirmative (Y), the program proceeds to step 24.

If, however, the answer at step 23 is negative (N), then at the next step 25 the l-th Word structure is brought from the Word Memory to the Working Memory, overwriting the previous Word structure, so that a number of operations can be carried out on the newly written Word structure. At step 26 the m-counter is reset to 0 (m = 0), and at step 27 the count of the m-counter is increased by "1" (m = m+1). Step 28 then checks whether the count of the m-counter has exceeded the number mmax, which represents the number of constituents present at that moment (m > mmax). If so (Y), the word (k) to be fitted, with the structure (l) in question, has been examined for attachment to all Constituent structures present in the Temporary Memory at that moment, and this part of the program must be repeated for the same word (k) but for the next Word structure (l+1). Accordingly, the program jumps back to step 22. If the answer at step 28 is negative (N), step 29 follows, where the m-th Constituent is fetched from the Temporary Memory and brought to the Working Memory, overwriting the previous Constituent. Both the l-th Word structure of the k-th Word and the m-th Constituent are thus in the Working Memory. A number of operations are carried out on these two structures (Word structure and Constituent structure) in program section 30.

In effect, an attempt is then made to place the Word structure within the Constituent structure, if necessary by creating other Constituent structures. This program section is worked out further in Fig. 6. After program section 30 has been executed, the program returns to step 27.

If the answer at step 28 is affirmative (Y), the program returns to step 22. At that point the process of fitting a word in on the right (the word right associate process) has been carried out for one Word structure (l) of one word (k) on all current constituents (mmax).
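The two nested counters of Fig. 3 can be sketched in Python as below. The callback fit is a hypothetical stand-in for program section 30: it is assumed to return zero, one or several new Constituent structures for one Word structure and one Constituent. The function name and the use of plain lists for the memories are likewise assumptions.

```python
from typing import Callable, List, Sequence, TypeVar

W = TypeVar("W")    # a Word structure
C = TypeVar("C")    # a Constituent structure


def right_associate_word(
    word_structures: Sequence[W],
    temporary_memory: Sequence[C],
    fit: Callable[[W, C], List[C]],
) -> List[C]:
    """Try every Word structure of one word against every current Constituent."""
    representation_memory: List[C] = []
    l = 0                                        # step 21: reset the l-counter
    while True:
        l += 1                                   # step 22: l = l + 1
        if l > len(word_structures):             # step 23: l > lmax,k ?
            break
        word_structure = word_structures[l - 1]  # step 25: fetch l-th Word structure
        m = 0                                    # step 26: reset the m-counter
        while True:
            m += 1                               # step 27: m = m + 1
            if m > len(temporary_memory):        # step 28: m > mmax ?
                break
            constituent = temporary_memory[m - 1]    # step 29: fetch m-th Constituent
            # program section 30: results go to the Representation Memory
            representation_memory.extend(fit(word_structure, constituent))
    return representation_memory
```

Because fit may return several structures or none, the number of representations can grow or shrink from one word to the next, which matches the multiplication of Constituent structures noted in the description.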

If, at some later moment, the answer at step 23 is affirmative (Y), the program proceeds to step 24. At that moment the process of "fitting a word in on the right" has been carried out for all word categories (lmax) of a word (k) on all current constituents (mmax), and the constituents formed in this way are stored in the Representation Memory. All these constituents must now be subjected to a program section for "closing constituents". To this end, the Temporary Memory is first cleared at step 24, and at the next step all constituents stored in the Representation Memory are transferred to the Temporary Memory. The Representation Memory is then cleared at step 32. At step 33 an m*-counter, whose count corresponds to a particular constituent in the Temporary Memory, is reset to 0 (m* = 0). At step 34 the count of the m*-counter is increased by "1" (m* = m*+1). Step 35 then checks whether the m*-counter has already exceeded the maximum value (m*max), which corresponds to the number of constituents (m* > m*max). If so (Y), all representations (m*max) obtained for all word categories (lmax) of the k-th word have been subjected to the program section for "closing constituents". The program then returns to step 14 in order to build representations using the next word (k+1).

If, however, the question at step 35 is answered in the negative (N), the m*-th representation is fetched from the Temporary Memory at the next step 36 and subjected at step 37 to the aforementioned program section for "closing constituents". The result obtained during this step is written into the Representation Memory, after which the program returns to step 34.
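The order of operations in this first embodiment (steps 21 to 37) can be sketched as two consecutive loops. This is only an illustrative sketch: the function names `fit_word` and `close_constituent`, and the modelling of a constituent as a tuple of labels, are assumptions made for the example and do not appear in the description.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in: a constituent is modelled as a tuple of labels.
Constituent = Tuple[str, ...]


def process_word_first_embodiment(
    word_structures: List[str],
    temporary_memory: List[Constituent],
    fit_word: Callable[[str, Constituent], List[Constituent]],
    close_constituent: Callable[[Constituent], List[Constituent]],
) -> List[Constituent]:
    """Return the Representation Memory contents after processing one word k."""
    representation_memory: List[Constituent] = []

    # Steps 21-30: fit every word structure (index l) into every
    # current constituent (index m) and collect the results.
    for word_structure in word_structures:
        for constituent in temporary_memory:
            representation_memory.extend(fit_word(word_structure, constituent))

    # Step 24 and the step after it: the Temporary Memory takes over the
    # results; step 32: the Representation Memory is cleared.
    temporary_memory = representation_memory
    representation_memory = []

    # Steps 33-37: close every constituent (index m*) in the Temporary Memory.
    for constituent in temporary_memory:
        representation_memory.extend(close_constituent(constituent))

    return representation_memory
```

With a toy `fit_word` that appends the word category to the constituent and a `close_constituent` that returns its argument unchanged, two word structures and two constituents yield four representations, ordered by word structure first.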

If the question at step 15 in Fig. 2 is answered in the affirmative (Y), all words (kmax) with all their associated Word structures (lmax,k) have been accommodated in the constituents, and at the next step 38 the question is asked whether the Representation Memory contains at least one Constituent whose Command field is empty. An empty Command field indicates that the Constituent in question is the uppermost Constituent or Top constituent (Root). This question is a real one, because steps 30 and 37 contain a built-in filtering procedure that eliminates unsound representations by means of a threshold circuit in which the value of the Probability field is compared with the current threshold value. This comparison will be explained in more detail with reference to Figures 6 to 9.
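The two tests mentioned here, the threshold filter of steps 30 and 37 and the Top constituent test of step 38, might look as follows in a minimal sketch. The `Constituent` class, its field names and the choice of `>=` for the comparison are assumptions for illustration; the description only states that the Probability field is compared with the current threshold value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Constituent:
    """Hypothetical, reduced Constituent structure."""
    category: str
    command: Optional[str]   # Command field; None models an empty field
    probability: float       # Probability field


def filter_by_threshold(memory: List[Constituent], threshold: float) -> List[Constituent]:
    # Assumed comparison: keep a representation when its probability
    # is at least the current threshold value.
    return [c for c in memory if c.probability >= threshold]


def has_top_constituent(memory: List[Constituent]) -> bool:
    # Step 38: is there at least one Constituent with an empty Command field?
    return any(c.command is None for c in memory)
```

Because the filter can remove every Top constituent, the test at step 38 can fail even after all words have been processed, which is why the question is asked at all.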

If the question at step 38 is answered in the negative (N), it must be ascertained whether the parser program section described above will yield a usable result, i.e. a meaningful representation, at a lower threshold value. The threshold value may not, however, be brought below a certain value, for example 0.2, since the sentence presented is then considered to be structurally and grammatically so faulty that no usable representation may be expected from the parsing process. Accordingly, after a negative answer (N) at step 38, the question is asked at the next step 39 whether the threshold value is still above a predetermined minimum value, for example 0.2. If the answer is affirmative (Y), the threshold value is set at the next step 40 to the next lower value in a fixed series of numerical values. The program then returns to step 12 in order to parse the presented sentence again, this time with the lower threshold value. Sentences that are not grammatically correct are thereby admitted as well. If the question at step 39 is answered in the negative (N), so that it must be assumed that no usable analysis can be found, a corresponding action must be taken. The sentence may, for instance, be marked as suspect, presumably containing many and/or very serious grammatical errors, and this may also be reported. It is permissible for step 40 to precede step 39, in which case step 39 is given a different value as its criterion.

If the question at step 38 is answered in the affirmative (Y), i.e. at least one representation of a Top constituent is present, the Structure-element fields of the Top constituents present are adjusted at the next step 42. The starting point is that the structure belonging to the sentence has already been created; a number of Command fields, however, still have to be filled in or adjusted. The Constituents occurring in the Representation Memory (Parent constituents) contain in their Structure-element field a number of Constituents (Member constituents) and/or Words (Member words). A Member constituent is a constituent that is dominated by its Parent constituent. The Command field of these Member constituents and Member words is now filled with an indication of, or reference to, the Parent constituent concerned. In this way the Command fields of all Member constituents and Member words listed in the Structure-element field of a given Parent constituent are provided with an indication or reference concerning that Parent constituent. The same is done for every Command field of the Constituents and Words within a Constituent whose Command field was filled in during the parsing process.
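The threshold-lowering loop of steps 38 to 40 can be sketched as follows. This is a hedged illustration only: `parse_once` is a hypothetical callable standing for one complete pass of steps 12 to 37 at a given threshold, the series `(1.0, 0.8, 0.6, 0.4, 0.2)` is an invented example of a "fixed series of numerical values", and representations are modelled as plain dictionaries. Only the minimum of 0.2 is taken from the text.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Representation = Dict[str, object]  # hypothetical stand-in


def parse_with_relaxation(
    parse_once: Callable[[float], List[Representation]],
    series: Sequence[float] = (1.0, 0.8, 0.6, 0.4, 0.2),  # assumed values
    minimum: float = 0.2,
) -> Tuple[Optional[float], List[Representation]]:
    """Try successively lower thresholds until a Top constituent is found."""
    for threshold in series:
        representations = parse_once(threshold)          # steps 12-37
        tops = [r for r in representations
                if r.get("command") is None]             # step 38
        if tops:
            return threshold, tops
        if threshold <= minimum:                         # step 39 answered N
            break
        # Step 40: the loop moves on to the next lower value in the series.
    return None, []                                      # no usable analysis
```

A return value of `(None, [])` corresponds to the case in which the sentence is marked as suspect.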

Since temporary functional labels may have been assigned during the parsing process, these are replaced by definitive labels at step 43. In the example grammar still to be described, the provisional label fNP is used for every Constituent of Category NP within, among others, a Constituent of Category S. During this step such labels are replaced by the definitive labels "Subject", "Object" (direct object), "Indobj." (indirect object) and "Pred" (predicate). This step will be explained further at a later stage. Finally, at step 44, the Representation with the highest probability factor is selected from the Representation Memory. This may also result in several Representations. For example, the sentence "Ik zag een meisje met de verrekijker." ("I saw a girl with the binoculars.") admits two solutions, since the phrase "met de verrekijker" ("with the binoculars") can be an adjunct of the verb as well as of the direct object.
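The selection at step 44, including the case in which several Representations share the highest probability factor, can be sketched as follows. Modelling a representation as a `(description, probability)` pair, the tolerance parameter and the probability values in the usage note are assumptions made for the example.

```python
from typing import List, Tuple

Representation = Tuple[str, float]  # hypothetical: (description, probability)


def select_best(
    representations: List[Representation],
    tolerance: float = 1e-9,
) -> List[Representation]:
    """Return every representation whose probability equals the maximum."""
    if not representations:
        return []
    best = max(probability for _, probability in representations)
    # Keep all representations tied for the highest probability factor.
    return [r for r in representations if best - r[1] <= tolerance]
```

For the binoculars sentence, two readings with equal (invented) probability, one attaching the prepositional phrase to the verb and one to the direct object, would both be returned.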

A second embodiment of program section 20 will now be explained, likewise with reference to Fig. 3. Here the index l refers to a particular Constituent structure (and not to a particular Word structure), while the index m accordingly refers to a particular Word structure. As running index for steps 33 to 37, l* and l*max must then be used instead of m* and m*max respectively. Accordingly, at step 21 the count of the l-counter, which relates to the rank number of the constituent to be used, is reset to 0 (l = 0); at step 22 the count of the l-counter is increased by 1 (l = l + 1); and at step 23 the question is asked whether the count of the l-counter has already exceeded the maximum value (lmax), which corresponds to the number of constituents present (l > lmax). At step 25 the constituent indicated by the count of the l-counter is fetched from the Temporary Memory and overwrites the previous Constituent in the Working Memory. Next, at step 26 the m-counter, whose count corresponds to the rank number of a particular Word category, is reset to 0 (m = 0). At step 27 the count of the m-counter is increased by 1 (m = m + 1), and at step 28 the question is asked whether the count of the m-counter has already exceeded the maximum value mmax, which corresponds to the number of word structures (m > mmax). At step 29 the m-th Word is fetched from the Word Memory and overwrites the previous word in the Working Memory. The m-th Word and the l-th Constituent are then subjected to the fitting-in process of step 30.

An affirmative answer (Y) to the question at step 28 means that all word structures (mmax) of one word (k) have been examined in program section 20 for connection to one Constituent (l). The remaining steps proceed as already described.

In the embodiments described so far, all representations of constituents that are possible for one word (k), for all of its word structures and on the basis of the available constituents, are first created or extended before program section 37 for "closing these constituents" is carried out.

In the following embodiment, all representations (m*max) of constituents that are possible for one Word (k) and one Word structure (l) on the basis of all constituents (mmax) are first created or extended before program section 37 for closing these constituents (m*) is carried out. The same program section is then repeated for the next Word structure (l + 1).

The third embodiment, which is based on this line of thought, is shown in Fig. 4.

At step 21 the l-counter, which relates to the rank number of a Word structure of Word (k), is reset to 0 (l = 0), and at step 22 the count of the l-counter is increased by 1 (l = l + 1). At step 23 the question is then asked whether the count of the l-counter has already exceeded the maximum value (lmax) (l > lmax). At step 25 the word with the Word structure defined by the l-counter is fetched from the Word Memory and placed in the Working Memory. Next, at step 26 the m-counter, whose count represents the rank number of a constituent in the Temporary Memory, is reset to 0 (m = 0), and at step 27 the count of the m-counter is increased by 1 (m = m + 1). At step 28 the question is asked whether all constituents (mmax) have already been examined for connection by the program section of step 30 (m > mmax). If the answer is negative (N), the constituent defined by the count of the m-counter is fetched from the Temporary Memory at step 29 and placed in the Working Memory, after which the connection of the l-th Word structure to the m-th Constituent structure is examined at step 30 and the result is written into the Representation Memory.

If the question at step 28 is answered in the affirmative (Y), the contents of the Representation Memory and of the Temporary Switch Memory are exchanged at the next step 45, after which the Representation Memory is cleared at step 32. At step 33 the m*-counter, whose count represents the rank number of a particular Constituent in the Temporary Switch Memory, is then reset to 0 (m* = 0). At step 34 the count of the m*-counter is increased by 1 (m* = m* + 1), and at step 35 the question is asked whether the count of the m*-counter has already exceeded the maximum value (m*max) (m* > m*max). If this is not the case (N), the constituent indicated by the m*-counter is fetched from the Temporary Switch Memory at step 36 and subjected at step 37 to the program section for closing Constituents. The result is then stored in the Representation Memory, and the program returns to step 34. If the question at step 35 is answered in the affirmative (Y), the contents of the Representation Memory are copied to a Buffer Memory at the next step 47 and the Representation Memory is cleared at step 48.

The program then returns to step 22. The program section thus carried out is then repeated for the next Word structure (l + 1) of the k-th word.

If the question at step 23 is answered in the affirmative (Y), the program proceeds to step 46, at which the Buffer Memory is copied to the Representation Memory and the Buffer Memory is then cleared. The program then returns to step 14 (see Fig. 2).
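The third embodiment of Fig. 4 differs from the first mainly in when the closing section runs: once per Word structure instead of once per word. A minimal sketch under stated assumptions follows. The names `fit_word` and `close_constituent` are hypothetical, constituents are modelled as tuples, and the copy at step 47 is assumed to append to the Buffer Memory so that the results of earlier Word structures are retained.

```python
from typing import Callable, List, Tuple

Constituent = Tuple[str, ...]  # hypothetical stand-in


def process_word_third_embodiment(
    word_structures: List[str],
    temporary_memory: List[Constituent],
    fit_word: Callable[[str, Constituent], List[Constituent]],
    close_constituent: Callable[[Constituent], List[Constituent]],
) -> List[Constituent]:
    buffer_memory: List[Constituent] = []

    for word_structure in word_structures:             # steps 21-23, 25
        representation_memory: List[Constituent] = []
        for constituent in temporary_memory:           # steps 26-30
            representation_memory.extend(fit_word(word_structure, constituent))

        # Steps 45 and 32: hand the results to the Temporary Switch Memory
        # and clear the Representation Memory.
        switch_memory, representation_memory = representation_memory, []

        for constituent in switch_memory:              # steps 33-37
            representation_memory.extend(close_constituent(constituent))

        # Steps 47 and 48: keep the closed results in the Buffer Memory
        # (assumed to accumulate) and clear the Representation Memory.
        buffer_memory.extend(representation_memory)

    # Step 46: the Buffer Memory becomes the new Representation Memory.
    return buffer_memory
```

The Temporary Memory itself is left untouched throughout, so every Word structure is fitted against the same set of current constituents.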

Instead of placing the result of process steps 37 and 30 first in the Representation Memory and then in the Buffer Memory, it is also possible to place the result directly in the Buffer Memory. This will, however, have consequences for the way these program sections are handled in Figs. 6 and 9.

A fourth embodiment, which differs from the preceding embodiment in the same way as the second embodiment differs from the first, will now be described in more detail. Here the indices l and m refer to the rank number of a Constituent and of a Word respectively. The designation l* is used as running index for steps 33 to 37.

Accordingly, at step 21 the count of the l-counter, which refers to the Constituent to be used, is reset to 0 (l = 0); at step 22 the count of the l-counter is increased by 1 (l = l + 1); and at step 23 the question is asked whether the count of the l-counter has already exceeded the maximum value (lmax), which corresponds to the number of constituents present (l > lmax). At step 25 the Constituent indicated by the count of the l-counter is fetched from the Temporary Memory and placed in the Working Memory. Next, at step 26 the m-counter, whose count corresponds to the rank number of a particular Word structure, is reset to 0 (m = 0). At step 27 the count of the m-counter is increased by 1 (m = m + 1), and at step 28 the question is answered whether the count of the m-counter has already exceeded the maximum value mmax, which corresponds to the number of word structures (m > mmax).

At step 29 the m-th Word structure is fetched from the Word Memory and placed in the Working Memory. This m-th Word structure and the l-th Constituent are then subjected to the fitting-in process of step 30, the result being stored in the Representation Memory, and the program returns to step 27. An affirmative answer (Y) to the question at step 28 thus means that all Word structures (mmax) of one word (k) have been examined in program section 30 for connection to one constituent (l); the contents of the Representation Memory are then transferred to the Temporary Switch Memory at step 45, and the Representation Memory is cleared. A phase then begins at step 33 that leads to the closing process for the Constituents present at step 37. If the question at step 35 is answered in the affirmative (Y), the program proceeds to step 47, after which the program section thus described is repeated for the next Constituent (l + 1) in the Temporary Memory. If the question at step 23 is answered in the affirmative (Y), the program returns via step 46 to step 14 (see Fig. 2).

Fig. 5 shows a fifth embodiment of program section 20. It is based on the idea that, each time one Word and one Constituent have been transferred to the Working Memory, program section 30 for fitting that Word into that Constituent is carried out and is followed immediately by program section 37 for closing the Constituents obtained. Accordingly, at step 21 the l-counter, whose count represents the rank number of the Word structure, is reset to 0 (l = 0); at step 22 the count of the l-counter is increased by 1 (l = l + 1); and at step 23 the question is asked whether the count of the l-counter has already exceeded the maximum value (lmax), which corresponds to the number of Word structures of the k-th word. If this is not the case (N), the l-th Word structure indicated by the l-counter is fetched from the Word Memory at step 25 and written into the Working Memory. Next, at step 26 the m-counter, whose count corresponds to the rank number of the m-th constituent in the Temporary Memory, is reset to 0 (m = 0), and at the next step 27 the count of the m-counter is increased by 1 (m = m + 1). At step 28 the question is asked whether the count of the m-counter has already exceeded the maximum value (mmax), which corresponds to the number of constituents in the Temporary Memory (m > mmax). If this is not the case (N), the m-th Constituent is written from the Temporary Memory to the Working Memory at the next step 29. Program section 30, concerning the connection of the l-th Word structure to the m-th Constituent structure, is then carried out and the result is recorded in the Representation Memory. Immediately afterwards the supplemented (or new) constituent or constituents are fetched and removed from the Representation Memory and subjected to program section 37 concerning their closure. During step 37 the result is written into the Representation Memory, after which the program returns to step 27.

If the question at step 28 is answered in the affirmative (Y), the program returns to step 22.

If the question at step 23 is answered in the affirmative (Y), the program returns to step 14 (see Fig. 2).

In a sixth embodiment, the indices l and m are interchanged compared with the fifth embodiment.

Steps 21, 22, 23 and 25 therefore relate to the Constituent representation with index l, and steps 26, 27, 28 and 29 to the Word structure with index m.

Program part 30 will now be explained in detail with reference to the flow diagram of Fig. 6.

This program part attempts to fit the offered Word structure into the offered Constituent structure. Although punctuation such as the comma and the conjunctions "en" (and), "alsmede" (as well as) and "of" (or) are regarded as Words, and are stored as such in the lexicological memory, they occupy a special position relative to other Words in the parsing process: they can be treated as a Member element within a newly created Parent Constituent of the Constituent in which the fitting of the conjunction or punctuation mark would otherwise have taken place. For this reason, such conjunctions and punctuation follow a different route within the word-fitting process. Hence, at the beginning of this program part, step 50 asks whether the Word category of the offered Word is "Seq". If this is the case (Y), the program goes to step 51. If this is not the case (N), at the next step 52 both the Word structure and the Constituent structure are written to the Working Memory. Then, at step 53, the Category field of the Word and that of the offered Constituent are used to search a memory section containing grammar rules, referred to as the second tabular memory. In the example grammar given, for instance, the Word category "article" and the offered Constituent "NP" yield the grammar rule "article (NP(det))", which means that the Word is fitted into the offered Constituent by means of the functional label "det". Further grammar rules of this type can be found in the attached Table B.
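The lookup of step 53 can be pictured as a table keyed on the pair (Word category, Constituent category). A minimal Python sketch follows, assuming a dictionary representation of the second tabular memory. Only the two "article" rules quoted in this description are included, and the names `GRAMMAR_RULES`, `lookup_rule` and `is_sequencer` are hypothetical.

```python
# Hypothetical excerpt of Table B: (Word category, Constituent category)
# maps to the data field of the grammar rule, a list of functional labels.
GRAMMAR_RULES = {
    ("article", "NP"): ["det"],        # article (NP(det))
    ("article", "PP"): ["det", "NP"],  # article (PP(det NP))
}

def is_sequencer(word_category):
    """Step 50: commas and coordinating conjunctions carry the category 'Seq'."""
    return word_category == "Seq"

def lookup_rule(word_category, constituent_category):
    """Step 53: return the rule's data field, or an empty list if there is none."""
    return list(GRAMMAR_RULES.get((word_category, constituent_category), []))
```

An empty result, such as for a verb offered to an NP, corresponds to a label length of 0.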

At the next step 54 it is checked whether the requested grammar rule exists and therefore has a data field. If the data field is empty (Y), so that the grammar rule provides no functional Word label, program part 30 concerning the fitting of a Word into a Constituent is considered finished. This is the case, for example, when the fitting of a verb into an NP has to be examined: this yields no solution (label length = 0). If the answer at step 54 is negative (N), the next step 55 asks whether the grammar rule has a data field with only one functional label. If this is the case (Y), as with the rule "article (NP(det))", then in the next program part 56 the offered Word, here "article", is included in the current Constituent structure.

This program part will be explained in more detail with reference to Fig. 7. If, however, the answer at step 55 is negative (N), so that the data field contains several elements, as is the case for the grammar rule "article (PP(det NP))", then at the next step 57 a new Constituent is created for the last element, here NP, and inserted between the Parent Constituent PP and the Word "article".

This new Constituent thus has the following structure:

Category field: the last element of the grammar rule, here NP.

Command field: the Parent Constituent, here PP.

Structure elements field: NIL, that is, the field contains no information.

Features field: all the features belonging to this Constituent category (NP), which can be taken from the first tabular memory. In this case: "neuter, inneuter, sing 1, sing 2, sing 3, plu 1, plu 2, plu 3, definite, indefinite". Further examples of this type are included in Table A.

Stack field: NIL, that is, the field contains no information.

Error message field: NIL, that is, the field contains no information.

Probability field: 1.

This new Constituent structure is now regarded as the current Constituent structure.

At the next step 58 the last element, here NP, is removed from the grammar rule, and the program returns to step 55.

Since in the present case only one element remains in the grammar rule, the program now proceeds to step 56. Step 56 is carried out with the current Word structure and the current Constituent structure. Note that what is regarded as the current Constituent structure may have been changed during step 57.
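Steps 54 to 58 can be read as a loop that strips the data field of the grammar rule from the right, creating one intermediate Constituent per stripped element until a single functional label is left. A minimal Python sketch, assuming a `Constituent` record whose attribute names are English renderings of the seven fields listed above; `resolve_rule` and `feature_table` are hypothetical names, and the feature list is the one quoted for NP.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Features quoted in the text for the category NP (first tabular memory).
NP_FEATURES = ["neuter", "inneuter", "sing 1", "sing 2", "sing 3",
               "plu 1", "plu 2", "plu 3", "definite", "indefinite"]

@dataclass
class Constituent:
    category: str                                   # Category field
    command: Optional["Constituent"] = None         # Command field (parent)
    structure_elements: List[tuple] = field(default_factory=list)  # NIL
    features: List[str] = field(default_factory=list)
    stack: List[str] = field(default_factory=list)  # NIL
    errors: List[str] = field(default_factory=list) # NIL
    probability: float = 1.0                        # Probability field: 1

def resolve_rule(labels, current, feature_table):
    """Return (functional label, current Constituent), or None if no rule."""
    labels = list(labels)
    if not labels:                    # step 54: empty data field, stop
        return None
    while len(labels) > 1:            # step 55 answered N
        category = labels.pop()       # last element; removed as in step 58
        current = Constituent(        # step 57: new Constituent under parent
            category=category,
            command=current,
            features=list(feature_table.get(category, [])),
        )
    return labels[0], current         # step 55 answered Y: go to step 56
```

For the rule "article (PP(det NP))" this yields the label "det" together with a fresh NP whose Command field refers to the PP, which is then the current Constituent for step 56.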

At step 51 the offered Constituent (into which a Word was to be fitted by means of steps 25 to 30) is copied. At the next step 59 it is checked whether the copied Constituent already has an element "Conj" in its Stack field. If this is the case (Y), it is not advisable to introduce yet another new Constituent with the prefix "Conj" as Parent Constituent for the copied Constituent and the Word or punctuation mark to be fitted. The process then goes to step 60, where the copied Constituent is placed in the Representation Memory pending the subsequent steps 61 to 66, in which this Constituent will be involved.
If the question at step 59 is answered in the negative (N), step 67 follows, where it is checked whether this Constituent already satisfies the conditions for closing a Constituent at this moment. A first condition is that the element most recently added to the Stack field is a suitable element on which the Constituent may end. A second condition is that certain elements must at least be present in the Stack field of the Constituent.

The first condition can also be replaced by another condition, which specifies the elements on which a Constituent may not end.

The first (replacement) condition and the second condition are examined by means of grammar rules stored in a memory section referred to as the third tabular memory. For example, the information (NP(det Nmod-a)head) indicates that in an NP neither a "determiner" nor an "Nmod-a" may act as the last element, and that a "head" must in any case occur in it. Further examples are included in Table C. If an NP does not satisfy such conditions, the value of the Probability field is lowered and a corresponding error code is entered in the Error message field. If the conditions are satisfied, no further action takes place at this step. Lowering the probability factor makes it likely that this Constituent will be eliminated after all in a subsequent filtering procedure (in program part 69).
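The two closure conditions can be sketched as a check against a per-category entry of the third tabular memory. The Python sketch below assumes that the Stack field is a list with the most recently added label first, that each violated condition multiplies the probability by a fixed `penalty`, and that error codes are plain strings. The penalty value, the error code names and the function name are all hypothetical, since the text only states that the probability is lowered and an error code is recorded.

```python
# Hypothetical encoding of the Table C entry (NP(det Nmod-a)head).
CLOSURE_RULES = {
    "NP": {"forbidden_last": {"det", "Nmod-a"}, "required": {"head"}},
}

def check_closure(category, stack, probability, penalty=0.5):
    """Return (new probability, error codes) for closing a Constituent."""
    rule = CLOSURE_RULES.get(category)
    errors = []
    if rule is None:
        return probability, errors
    # First (replacement) condition: the last added label must be allowed.
    if stack and stack[0] in rule["forbidden_last"]:
        errors.append("forbidden-final-element")
    # Second condition: certain labels must be present.
    if not rule["required"] <= set(stack):
        errors.append("missing-required-element")
    for _ in errors:
        probability *= penalty  # assumed form of "lowering" the probability
    return probability, errors
```

An NP consisting of a determiner only would fail both conditions, while a stack of "head" on top of "det" passes unchanged.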

At the next step 63, in order to accommodate a comma or one of the coordinating conjunctions alongside the copied Constituent, a new Parent Constituent is generated which receives the same Category designation as the copied Constituent, but with the prefix "Conj" added. The Command field of the new Constituent is made identical to that of the copied Constituent. The remaining fields are filled in the same way as at step 57.

At the next step 69 a proposal is made to fit the copied Constituent into the newly formed Constituent with the functional label "Conj". This is done using the functional label "Conj", the copied Constituent structure itself and the newly created Parent Constituent. The associated program part will be explained in more detail with reference to steps 81 to 89, and may lead to a new Constituent structure being added to the Representation Memory.

At the next step 70 it is checked whether a Constituent structure has indeed been added to the Representation Memory. If this is not the case (N), so that the copy Constituent has been eliminated, the process of fitting the "sequencer" is considered finished. If, on the other hand, the question at step 70 is answered in the affirmative, step 61 follows, in which the most recently added Constituent is read from the Representation Memory and also removed from it. At the next step 62 the same program is followed as at step 56, except that the functional label "seq" is proposed for the Stack field.
Since an element may have been added to the Representation Memory at step 62, step 63 asks whether this is indeed the case. If the answer is negative (N), this program part is considered finished. If the answer is affirmative (Y), at step 64 the most recently added element is taken from the Representation Memory, and at step 65 a new Constituent structure is created in the manner already described for step 57. The Command field is filled in with a reference to the Parent Constituent. The Constituent retrieved from the Representation Memory at step 64, which is now regarded as the current Constituent, serves as the Parent Constituent.

The Category field receives the same value as the Category field of the member Constituent that has the functional label "conj" within the current Constituent.

The remaining fields are filled in as at step 57.

At the next step 66 the newly created Constituent structure is added to the Representation Memory, after which the program for fitting a Word into a Constituent is considered finished.

Fig. 7 shows a flow diagram of the detailed program referred to in program parts 56 and 62.

At step 71 a copy of the current Constituent structure, that is, the structure of the Constituent into which the Word in question is to be fitted, is written to the Working Memory.

At the next step 72 the functional label obtained at steps 53 and 56, or at step 62, is added to the Stack field of this copy. Then, at step 73, a combination of structure elements, namely the functional label and the Word structure to be fitted, is added as a label pair to the Structure elements field. In practice, the address of the Word structure will be added rather than the Word structure itself.
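Steps 71 to 73 can be sketched as follows in Python, assuming a Constituent is represented as a dictionary with list-valued "stack" and "structure_elements" entries, and that new entries are inserted at the front of both lists (the text states this ordering for Constituent structures in general). The function name `attach_word` and the integer Word address are hypothetical.

```python
import copy

def attach_word(constituent, label, word_address):
    """Work on a copy so the offered Constituent itself stays untouched."""
    candidate = copy.deepcopy(constituent)          # step 71: copy
    candidate["stack"].insert(0, label)             # step 72: label to Stack field
    # Step 73: label pair (functional label, address of the Word structure).
    candidate["structure_elements"].insert(0, (label, word_address))
    return candidate
```

Because the original is never modified, several competing candidates can be derived from the same offered Constituent.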

Then, at step 74, the series of features associated with the category of the current Constituent and the functional label of the Word is selected from a memory section referred to as the fourth tabular memory. Every element of this series of features that does not occur in the Features field of the Word structure but does occur in the Features field of the Constituent structure is removed from the Features field of the Constituent structure.

To illustrate the above, the following example is given: according to the fourth tabular memory (see Table D), "NP" and "determiner" are associated with the series of features "neuter, inneuter, sing 1, sing 2, sing 3, plu 1, plu 2, plu 3, definite, indefinite". Suppose that the Constituent structure with category "NP" has the features "sing 1, sing 2, sing 3, plu 1, plu 2, plu 3, neuter, inneuter". If the Word structure has, for example, the features "neuter, definite, sing 3" in its Features field (as is the case for the article "het"), the features sing 1, sing 2, plu 1, plu 2, plu 3 and inneuter must be removed from the Features field of the Constituent structure.
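The feature filtering of step 74 can be reproduced with the example values from the text. A minimal Python sketch, assuming the fourth tabular memory is a dictionary keyed on (Constituent category, functional label); `FEATURE_TABLE` and `filter_features` are hypothetical names.

```python
# Table D entry quoted in the text for NP and determiner.
FEATURE_TABLE = {
    ("NP", "determiner"): ["neuter", "inneuter", "sing 1", "sing 2", "sing 3",
                           "plu 1", "plu 2", "plu 3", "definite", "indefinite"],
}

def filter_features(category, label, constituent_features, word_features):
    """Drop every listed feature the Constituent has but the Word lacks."""
    relevant = set(FEATURE_TABLE.get((category, label), []))
    return [f for f in constituent_features
            if f not in relevant or f in word_features]

np_features = ["sing 1", "sing 2", "sing 3", "plu 1", "plu 2", "plu 3",
               "neuter", "inneuter"]
het_features = ["neuter", "definite", "sing 3"]
remaining = filter_features("NP", "determiner", np_features, het_features)
# remaining == ["sing 3", "neuter"]
```

Features that are not listed for the given category and label are left alone, so only the agreement-relevant part of the Features field is narrowed.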

At the next step 75 a number of checks are carried out if desired, using a memory section referred to as the fifth tabular memory (see Table E). For each Constituent type (Constituent category), the Stack field is examined for incorrect combinations, incorrect order and an incorrect number of functional labels. For example, the occurrence in an NP Constituent of two consecutive functional labels "head" is to be regarded as an incorrect combination; the functional label "Nmod-S" followed by a "head" as a combination in incorrect order; and more than one functional label "determiner" in an NP as an incorrect number of functional labels for such a Constituent. An incorrect order of two functional labels is also present when a "head" is followed by an "Nmod-A". It must be borne in mind here that in the Constituent structures the most recently added elements come first.

Table E expresses these checks as: "NP ((head head)) ((head Nmod-S) (Nmod-a head)) ((det 1))".
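The Table E entry for NP can be read as three groups: forbidden combinations, forbidden orders, and maximum counts. A minimal Python sketch of the checks of step 75 follows, assuming the Stack field is a list with the most recently added label first and that the listed pairs refer to adjacent labels in that list (the adjacency is an assumption; the text does not state it). The dictionary layout and the name `check_stack` are hypothetical.

```python
# Hypothetical encoding of "NP ((head head)) ((head Nmod-S) (Nmod-a head)) ((det 1))".
STACK_CHECKS = {
    "NP": {
        "bad_combinations": [("head", "head")],
        "bad_orders": [("head", "Nmod-S"), ("Nmod-a", "head")],
        "max_counts": {"det": 1},
    },
}

def check_stack(category, stack):
    """Return a list of (kind, detail) tuples, one per violated check."""
    checks = STACK_CHECKS.get(category)
    if checks is None:
        return []
    errors = []
    # Adjacent pairs, written in stack order (latest label first).
    for pair in zip(stack, stack[1:]):
        if pair in checks["bad_combinations"]:
            errors.append(("combination", pair))
        if pair in checks["bad_orders"]:
            errors.append(("order", pair))
    for label, limit in checks["max_counts"].items():
        if stack.count(label) > limit:
            errors.append(("count", label))
    return errors
```

Each reported violation would then lower the Probability field and could add an error code, in the same way as for the closure conditions.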

This table can be extended to other Constituents and also in the number of test rules per Constituent. A second check is carried out for each Constituent category by detecting incorrect combinations in the Features field using Table F, which is stored in a memory section referred to as the sixth tabular memory. In an NP, for example, the combination "inneuter" and "neuter" belonging to the word combination "de huis" is to be regarded as incorrect. This shows up as the absence of both neuter and inneuter from the Features field.

The Features field can also be checked for combinations that are considered necessarily present.

Each Constituent must show at least one sanctioned combination of Feature labels in its Features field.

For example, the Features field of the NP Constituent "een groot huis" (a big house) contains the combination "indefinite, sing 3, adj-not-infl., adj. enumerative, neuter", while one proven standard combination is (indefinite, adj.-not-inflected, neuter), which is found in that Features field. The NP Constituent "het grote huis" (the big house) has the Features field (definite, sing 3, adj. inflected, adj. enumerative, neuter), in which the standard combination (definite, adj. inflected) is found. In this way a series of standard combinations of Feature labels can be drawn up for an NP Constituent. Such a series of standard combinations for an NP could be: NP ((definite, adj.-inflected) (plu 3, adj.-inflected) (inneuter, adj.-inflected) (neuter, indefinite, adj.-not inflected)). Similar standard combinations can be drawn up for other types of Constituents; together they are stored in the memory section referred to as the sixth tabular memory (see Table F).

A Constituent must then satisfy at least one of the associated standard combinations. If it does not, the value of the Probability field of that Constituent must be lowered. For example, the incorrect NP Constituent "een grote huis" has the Features field (indefinite, sing 3, adj. inflected, adj.-enumerative, neuter), in which none of the proven standard combinations is found.
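The standard-combination test is a subset check: a Constituent passes if at least one listed combination is entirely contained in its Features field. A minimal Python sketch using the NP series and the three example phrases from the text, with label spellings normalised to a single form for the sake of the example. `STANDARD_COMBINATIONS` and `has_standard_combination` are hypothetical names, and treating categories without a table entry as passing is an assumption.

```python
# NP series of standard combinations as quoted in the text (Table F).
STANDARD_COMBINATIONS = {
    "NP": [
        {"definite", "adj-inflected"},
        {"plu 3", "adj-inflected"},
        {"inneuter", "adj-inflected"},
        {"neuter", "indefinite", "adj-not-inflected"},
    ],
}

def has_standard_combination(category, features):
    """True if some standard combination is a subset of the Features field."""
    combinations = STANDARD_COMBINATIONS.get(category)
    if combinations is None:
        return True  # assumption: no entry means nothing to check
    present = set(features)
    return any(combo <= present for combo in combinations)

een_groot_huis = ["indefinite", "sing 3", "adj-not-inflected",
                  "adj-enumerative", "neuter"]
het_grote_huis = ["definite", "sing 3", "adj-inflected",
                  "adj-enumerative", "neuter"]
een_grote_huis = ["indefinite", "sing 3", "adj-inflected",
                  "adj-enumerative", "neuter"]
```

With these inputs the first two phrases pass and "een grote huis" fails, so its Probability field would be lowered.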

An error code can also be entered in the Error message field at this point; it is likewise displayed when the parsed sentence is presented on the screen.

At the next step 76 the question is asked whether the value of the Probability field of the copy Constituent is greater than or equal to the threshold value. If the answer is negative (N), the copy Constituent is eliminated, and the step under consideration, 56 or 62 in Fig. 6, is considered finished.

If the answer to the question of step 76 is affirmative (Y), the next step 77 asks whether, given the current threshold value, the Constituent can still be closed. For this purpose the possible functional labels for the current Constituent within the Constituent named in its Command field are tested. If the present Constituent has more than one functional label in its Stack field (Y), this test has already been passed with a positive result at an earlier stage and therefore need not be repeated. In that case, just as when it turns out that the Constituent can still be closed at the current threshold value, the program continues with step 78. Otherwise (N), the program part shown in Fig. 7 can be considered finished. This step may be regarded as optional: it is not necessary for the parsing process, but it speeds up obtaining the final result. At the next step 78 this Constituent is stored in the Representation Memory.

With reference to the flow diagram of Fig. 8, a detailed explanation of program section 69 now follows.

Since this program section has much in common with program steps 71 to 78, reference will be made to them at some of the program steps.

At step 79, a copy of the Parent constituent is made for the working memory, and this copy is used in the following steps.

In the next step 80, the probability factor of the copied constituent is adjusted by correcting it with the probability factor of the constituent dominated by the Parent constituent (the Member constituent), for example by multiplying the two probability factors. Another suitable function f(x, y) of the two probability factors x and y can also be used to determine the new value of the probability factor.
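The adjustment of step 80 can be illustrated with a minimal Python sketch. The function name `combine_probability` and the choice of `min` as an alternative f(x, y) are illustrative assumptions; the description names only multiplication as an example.

```python
from typing import Callable


def combine_probability(
    parent: float,
    member: float,
    f: Callable[[float, float], float] = lambda x, y: x * y,
) -> float:
    """Return the new probability factor of the copied Parent constituent.

    By default the two factors are multiplied, as the description suggests;
    any other function f(x, y) can be passed instead (hypothetical choice).
    """
    return f(parent, member)


# Default: multiplication of both probability factors.
product = combine_probability(0.5, 0.5)

# Assumed alternative f(x, y): keep the weaker of the two factors.
weakest = combine_probability(0.8, 0.5, f=min)
```

With multiplication, every doubtful member lowers the score of the whole structure, so the later threshold comparison prunes it earlier.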

At step 81, the functional label, as mentioned at step 69 and at steps 99 and 104 to be described later, is entered in the Stack field of the copy constituent.

Then, at step 82, a pair consisting of the functional label followed by the dominated constituent is added to the Structural Elements field of the copy constituent.

At step 83, a Feature list is looked up in the fourth tabular memory under the category designation of the Parent constituent and the functional label of the Member constituent dominated by it (see Table D). All Features that occur both in this list and in the Feature list of the constituent copied at step 79, but not in the Feature list of the Member constituent, are removed from the Features field of the copied constituent. In addition, all features present in the second Feature list from the table, insofar as they are not already present in the Features field of the copied constituent, are added to that field. This step is comparable to step 74.
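The feature (Kenmerken) update of step 83 amounts to two set operations, sketched below in Python. The function name, the parameter names and the feature values in the usage lines are hypothetical; only the removal and addition rules come from the description.

```python
def update_features(
    copy_features: set,
    member_features: set,
    table_first_list: set,
    table_second_list: set,
) -> set:
    """Return the new feature set of the copied Parent constituent.

    A feature is removed when it occurs in the first table list and in the
    copy, but not in the Member constituent. Every feature of the second
    table list is then added if it is not already present.
    """
    to_remove = (table_first_list & copy_features) - member_features
    return (copy_features - to_remove) | table_second_list


# Hypothetical feature names, for illustration only.
result = update_features(
    copy_features={"sing", "plu", "neuter"},
    member_features={"sing"},
    table_first_list={"sing", "plu"},
    table_second_list={"definite"},
)
```

Here "plu" is dropped because the member does not carry it, "sing" survives, and "definite" is added from the second list.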

At step 84, a program section is executed as described at step 75.

The subsequent steps 85, 86 and 87, which apply to the copied constituent, are comparable to steps 76, 77 and 78, with the proviso that a Constituent rather than a Word is involved here.

With reference to the flow diagram of Fig. 9, a detailed explanation now follows of program section 37, which concerns closing the current Constituent structure.

At step 88, the current Constituent structure, which at the preceding step 36 was supplied from the Temporary memory (see Fig. 3), from the Temporary Switch memory (see Fig. 4), from program section 3Ü (see Fig. 5) or from program sections 103 and 108 in Fig. 9, still to be discussed, is added to the Representation memory. Since a Constituent structure whose Stack field contains no element may have been created at step 65, closing such a constituent is not permitted. Step 89 therefore asks whether the Stack field is empty. If so (Y), this program part is regarded as terminated. If not (N), step 90 asks whether the Command field of the current Constituent contains a value. If it does not (N), the Constituent is a Top constituent and is not eligible for closure; for a Top constituent, the program section of Fig. 9 is then regarded as terminated. If the answer at step 90 is affirmative (Y), step 91 follows, where a Copy constituent is made of the current Constituent. In the next step 92, the category of the Copy constituent, hereinafter called the Member constituent, is determined. Then, at step 93, the category of the Parent constituent is determined from the Command field of the Member constituent. Subsequently, using the category data obtained in steps 92 and 93 for the Member constituent and its Parent constituent, and with the aid of a memory part to be regarded as the seventh tabular memory, the associated grammar rule is selected at step 94 (see Table G).

If the category of the Member constituent is NP and the category of the Parent constituent is "S", it follows from the grammar rule NP (S, fNP) that the Member constituent can be assigned the functional label fNP. In principle, more than one element may stand at that position in the table.
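The lookup of step 94 can be pictured as a table keyed on the pair (Member category, Parent category). The Python sketch below is an assumption about the layout of Table G: only the NP/S entry is taken from the description, and the names `TABLE_G` and `candidate_labels` are hypothetical.

```python
# Hypothetical excerpt of Table G: (member category, parent category) -> labels.
# Only the entry NP (S, fNP) is taken from the description.
TABLE_G = {
    ("NP", "S"): ["fNP"],
}


def candidate_labels(member_category: str, parent_category: str) -> list:
    """Return the functional labels the grammar rule proposes (step 94).

    An empty list corresponds to the negative answer at step 97:
    the current Constituent cannot be closed.
    """
    return list(TABLE_G.get((member_category, parent_category), []))


labels = candidate_labels("NP", "S")
```

Returning a list rather than a single label leaves room for table positions that hold more than one element.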

Then, at step 95, a filter procedure is carried out in which conditions on the elements in the Stack field are tested. The test criteria are stored in the third tabular memory, and the test itself proceeds exactly as in step 67 (see Table C).

In the next step 96 it is examined whether the value in the Probability field is greater than or equal to the threshold value, since the value of the Probability field may have been lowered at step 95. If it is not (N), program section 37 is regarded as terminated. Otherwise (Y), step 97 follows, which asks whether the grammar rule selected at step 94 contains functional labels. If it does not (N), the current Constituent cannot be closed, and the process of closing it is to be regarded as terminated. If the answer is affirmative (Y), at least one meaningful closing procedure is possible. It must then be ascertained whether one or more than one functional label was proposed at step 94. To this end, the next step 98 asks whether exactly one functional label was proposed at step 94. An affirmative answer (Y) leads to step 99, a negative answer (N) to step 100.

In program section 99, the current Constituent is proposed as an element, with the functional label indicated in the grammar rule of step 94, within the Constituent structure of the Parent constituent. The Parent constituent is the one indicated in the Command field of the current Constituent. The current Constituent with the proposed functional label is then entered in the Member field of the Parent constituent, as discussed in detail at steps 79 to 83. At step 101, it is checked whether an element was added to the Representation memory during program section 99 (as explained at step 78). If not (N), this program section is regarded as terminated. If so (Y), the program continues with step 102, where the last added element is fetched from the Representation memory and removed from it, and then at step 103 is subjected to the closing procedure given by the sequence of steps starting at number 88.

At step 100, the current Constituent structure is copied, the Copy henceforth being regarded as the current Constituent. In program section 104, the current Constituent is proposed as an element, with the functional label given as the first element in the grammar rule of step 94, within the Constituent structure of the Parent constituent. The current Constituent with the proposed functional label is then entered in the Member field of the Parent constituent concerned, as set out in detail at steps 79 to 87. At step 105, the first element (the first functional element) is removed from the grammar rule.

Next, step 106 checks whether the number of Constituents in the Representation memory has increased during program section 104. If not (N), the program returns directly to step 98 in order to carry out the closing process of the current Constituent with the reduced grammar rule. If so (Y), in the next step 107 the last added Constituent structure is fetched from the Representation memory, removed from it and regarded as the current Constituent, to which in the next step 108 the closing program given by the series of program steps starting at number 88 is applied. After step 108, the program returns to step 98.

Since temporary labels may have been assigned during the preceding program, these are replaced by definitive labels in the program section that follows the actual parsing process. Thus, for the Example grammar used here, in all Constituents stored in the Representation memory the labels fNP are replaced, wherever possible, by the definitive labels Subject, Object, Indirect Object and Predicate, abbreviated respectively as Subj., Obj., Indobj. and Pred.

To this end, at step 109 the count of the m*-counter, which refers to the rank number of a Constituent, is reset to 0, after which at step 110 the count of this counter is incremented by 1.

At step 111 it is checked whether the count of the m*-counter has already exceeded the maximum value (m*max), which corresponds to the number of Constituents. If so (Y), this program section is regarded as terminated. If not (N), the next step 112 asks whether the Command field of the m*-th Constituent structure contains an entry. If it does (Y), the Constituent is not a Top constituent, and definitive labels need not be assigned.

If the answer is negative (N), program phase 113 follows, in which the provisional functional labels are replaced by definitive ones, after which the program returns to step 110.
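The counter loop of steps 109 to 113 can be sketched in Python as follows. The `Constituent` class and its field names are illustrative assumptions; the sketch only collects the Top constituents that would be handed to program phase 113, and it assumes that a non-Top constituent simply sends the program back to step 110.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Constituent:
    """Hypothetical, reduced model of a Constituent structure."""
    category: str
    command: Optional[str] = None          # Command field; None for a Top constituent
    stack: List[str] = field(default_factory=list)


def select_top_constituents(representation_memory: List[Constituent]) -> List[Constituent]:
    """Walk the Representation memory with an explicit m*-counter."""
    selected = []
    m = 0                                   # step 109: reset the counter
    while True:
        m += 1                              # step 110: increment by 1
        if m > len(representation_memory):  # step 111: maximum exceeded, done
            break
        current = representation_memory[m - 1]
        if current.command is not None:     # step 112 (Y): not a Top constituent
            continue
        selected.append(current)            # step 112 (N): phase 113 would run here
    return selected


memory = [Constituent("NP", command="S"), Constituent("S"), Constituent("PP", command="NP")]
tops = select_top_constituents(memory)
```

In idiomatic Python the counter would be a plain `for` loop; it is kept explicit here to mirror the numbered steps.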

With reference to Fig. 11, program phase 113 is now explained in more detail.

At step 114, an n-counter, whose count refers to the rank number of the member pair in the Structural Elements field of the current Constituent, is reset to 0. Then, at step 115, the count of this counter is incremented by 1.

In the next step 116, the question is asked whether the count of the n-counter has already exceeded the maximum value (nmax), which corresponds to the number of member pairs in that Structural Elements field. If so (Y), program section 113 is regarded as terminated and the program returns to step 110. If not (N), in the next step 117 the second element of the pair of the current Constituent designated by the count of the n-counter is determined and designated as the current Constituent. The next step 118 asks whether the element designated as the current Constituent is indeed a Constituent and not a Word. If it is not (N), the program returns to step 115. If it is (Y), the current Constituent, that is, the Constituent designated at step 117, is examined for its structure and hence for the contents of its Stack field. This implies that the same procedure as described at steps 114, 115, 116, 117 and 118 must be carried out again within program section 119, with steps 120, 121, 122, 123 and 124, but now on the Member pairs of the element designated as the current Constituent. Should one of the Member pairs have a second element that represents a Constituent rather than a Word (as may be the case with relative clauses, for example), the next step 125 must likewise be replaced by a program section as designated by 119. This description considers the situation in which, for all Member pairs, the second element is designated a Word at step 118.

The consequence of this assumption is that step 119 can in fact be omitted and the program arrives at step 126, which asks whether one of the elements in the Stack field of the Constituent selected at step 118, namely the second element of the n-th pair, is a temporary label. If not (N), the program returns to step 115. If so (Y), a program section follows at step 127 in which the temporary functional label is replaced by a definitive functional label. The contents of the Stack field of the m*-th Constituent structure are examined and compared with certain grammar rules stored in a memory section to be regarded as the eighth tabular memory. Such a rule could be: {(fNP, Vfin-main> Smod, fNP, Endmark)(Subj., Obj.), which means that the first fNP is replaced by the label "Subject" and the second fNP by the label "Object" (see Table H).
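The replacement of step 127 can be sketched in Python as a match of the Stack field against the rules of Table H. The sketch assumes that the ">" in the printed rule is a garbled comma, so that the pattern reads fNP, Vfin-main, Smod, fNP, Endmark; the names `TABLE_H` and `finalize_labels` are hypothetical.

```python
TEMPORARY_LABEL = "fNP"

# Hypothetical excerpt of Table H: (pattern of the Stack field, definitive labels).
# Assumption: the pattern elements are separated by commas throughout.
TABLE_H = [
    (("fNP", "Vfin-main", "Smod", "fNP", "Endmark"), ("Subj.", "Obj.")),
]


def finalize_labels(stack: list) -> list:
    """Return one relabelled copy of the Stack field per matching rule."""
    solutions = []
    for pattern, finals in TABLE_H:
        if tuple(stack) != pattern:
            continue
        remaining = iter(finals)
        # Each temporary label takes the next definitive label, in order;
        # if the rule runs out of labels, the temporary label is kept.
        relabelled = [
            next(remaining, label) if label == TEMPORARY_LABEL else label
            for label in stack
        ]
        solutions.append(relabelled)
    return solutions


result = finalize_labels(["fNP", "Vfin-main", "Smod", "fNP", "Endmark"])
```

Returning a list of solutions rather than a single one allows a Stack field that matches several rules to yield several relabelled copies.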

If several solutions exist for a given content of the Stack field, this leads to a corresponding number of Top constituents, all of which are written to the Representation memory. This is done by copying the current structure with the provisional functional labels.

First of all, the provisional functional labels in the Structural Elements field of each Top constituent are then replaced in a corresponding manner.

Then, at step 128, a filter process is carried out using a memory part to be regarded as the ninth tabular memory (see Table I). It checks whether the conversion from provisional to definitive functional labels produced by the grammar rule of this program section is permitted, given the required agreement of certain elements in the Features field of some mutually related Constituents. In particular, such a relation exists between the Constituent structures "Subject" and "Verb", whose inflection indications must be consistent. Table I lists the grammar rules intended for this, arranged by category designation of the constituent to be examined. Each grammar rule also contains a list of elements and a list of features. For every structure (Constituent or Word) that corresponds to one of the elements in the list, the Features field is fetched. The intersection of all these Features fields with one another and with the list of features found must not be empty.

For example, for the sentence "Hij lopen" ("He walk"), the intersection of the Features of the "Subject" with the list of features found yields "sing 3", after which the intersection becomes empty when the Features of the "Verb" are brought in. The sentence construction is thus ungrammatical. The analysis obtained is now made less probable by lowering the value in the Probability field, and the Error message field is also given an error code.
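The agreement filter of step 128 can be sketched in Python as a running set intersection. The feature names, the penalty factor 0.5 and the error code "AGREEMENT" are illustrative assumptions; the description specifies only that the intersection must not be empty and that a failure lowers the probability and sets an error code.

```python
from typing import Iterable, Optional, Set, Tuple


def agreement_filter(
    rule_features: Set[str],
    feature_fields: Iterable[Set[str]],
    probability: float,
    penalty: float = 0.5,            # hypothetical correction factor
) -> Tuple[float, Optional[str]]:
    """Intersect the rule's feature list with every fetched feature set.

    Returns the (possibly lowered) probability and an error code, or None
    when the related structures agree.
    """
    common = set(rule_features)
    for features in feature_fields:
        common &= features
    if common:
        return probability, None
    return probability * penalty, "AGREEMENT"  # hypothetical error code


# Assumed inflection features for the rule relating Subject and Verb.
RULE = {"sing1", "sing2", "sing3", "plu1", "plu2", "plu3"}

# "Hij lopen": singular subject, plural verb form -> empty intersection.
bad = agreement_filter(RULE, [{"sing3"}, {"plu1", "plu2", "plu3"}], 1.0)

# "Hij loopt": both carry sing3 -> analysis left untouched.
good = agreement_filter(RULE, [{"sing3"}, {"sing3"}], 1.0)
```

Lowering the probability instead of rejecting the analysis outright keeps an ungrammatical sentence parseable while still recording the error.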

After the filter process, the program returns to step 115. Steps 129, 130 and 131 are identical to steps 126, 127 and 128.

Following the "syntax directed parser", a "syntax embedded parser" operating according to the described inventive idea will now be explained. The latter parser differs from the former in that the steps performed with a tabular memory are replaced by actions carried out programmatically. As an example, program step 75 (Fig. 7) will be considered, part of the contents of Table E being represented in the flow diagram of Fig. 12.

In Fig. 12, step 132 asks whether the constituent is of category "S". If so (Y), program section 133 follows, after which the program of step 75 is regarded as terminated.

If not (N), step 134 follows, which asks whether the constituent is of type Srel. If so (Y), an examination program 135 directed at it follows. If not (N), step 136 follows, where the possible processing phase for the constituent Scomp begins, possibly making use of step 137.

The program section starting at step 138, which concerns an NP constituent and its associated examination program, will now be explained in more detail. At step 138 the question is asked whether the constituent is an NP. If the answer is affirmative (Y), program step 139 follows, where the question is asked whether the functional label "head" occurs twice in succession. If so (Y), program step 152 follows, where the value of the Probability field is lowered by a certain correction factor, after which the program proceeds to step 140. If the question at step 139 is answered in the negative (N), one or more similar questions follow, which are not set out further here but which ultimately arrive at program step 140. There the question is asked whether, in the constituent under examination, the functional label "head" is followed by the label "det". If so (Y), the order is incorrect, and in the subsequent step 141 the value of the Probability field is corrected by a certain factor. If the question at step 140 is answered in the negative (N), step 142 follows, with the question whether the functional label "head" is followed by the label "nmod-a". If so (Y), the value of the Probability field of the constituent under examination is lowered by a certain factor at step 143. If the question at step 142 is answered in the negative (N), a similar question is handled at a following step.

Eventually, and therefore also after steps 141 and 143, the program arrives at the first step 144 of the next phase of the examination program, where the question is asked whether the label "det" occurs once in the constituent. If the answer is negative (N), the value of the Probability field is lowered by a certain factor at the next program step 145. If the answer is affirmative (Y), step 146 follows, where the question is asked whether the constituent contains only one functional label "head". If not (N), the value of the Probability field is lowered at step 147. After the program steps to be carried out in this phase, the program run at step 75 is considered finished.

If the question at step 138 is answered in the negative (N), step 148 follows, where the question concerns a PP constituent. If that question is answered in the affirmative (Y), an examination program relating to it is carried out. A similar program section, but for an AP constituent, takes place at steps 150 and 151, after which the program equivalent to step 75 of Fig. 2 is considered finished.
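The NP examination described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `check_np`, the representation of a constituent as a list of functional labels, the reading of "followed by" as "immediately followed by", and the lowering of the Probability field by multiplication with a factor of 0.5 are not taken from the text, which speaks only of "a certain factor". The similar questions that the text leaves unspecified are omitted.

```python
def check_np(labels, probability, factor=0.5):
    """Return the Probability field after the NP checks of steps 139 to 147.

    labels: functional labels of the NP constituent, in sentence order.
    factor: hypothetical correction factor (the text says "a certain factor").
    """
    pairs = list(zip(labels, labels[1:]))  # adjacent label pairs

    # Steps 139/152: "head" twice in succession.
    if ("head", "head") in pairs:
        probability *= factor

    # Steps 140/141: "head" followed by "det" is an incorrect order.
    if ("head", "det") in pairs:
        probability *= factor
    # Steps 142/143: reached only when step 140 answers N.
    elif ("head", "nmod-a") in pairs:
        probability *= factor

    # Steps 144/145: "det" should occur once.
    if labels.count("det") != 1:
        probability *= factor
    # Steps 146/147: reached on the Y branch of step 144; one "head" expected.
    # Whether step 145 also continues to step 146 is not stated in the text;
    # this sketch follows the literal branches only.
    elif labels.count("head") != 1:
        probability *= factor

    return probability
```

For example, `check_np(["det", "nmod-a", "head"], 1.0)` leaves the value unchanged, while a constituent labelled `["det", "head", "head"]` is lowered twice, once at step 152 and once at step 147.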

A similar programmatic treatment is also possible for the other tables of the "syntax directed parser".

Table A

(ap (adj-inflected adj-not-inflected))
(np (definite indefinite neuter inneuter sing1 sing2 sing3 plu1 plu2 plu3))
(conj-np (definite indefinite neuter inneuter))

Table B

(article (np (det)) (pp (det np)) (s (det np)) (s-rel (det np)) (s-comp (det n)))
(adj (ap (head)) (np (head ap)) (s (head ap np) (head ap-adj)) (s-rel (head ap np) (head ap-adj)) (s-comp (head ap np) (head ap-adj)) (ap-adj (head)))
(adv (s (head ap-adv)) (s-rel (head ap-adv)) (s-comp (head ap-adv)))
(advmod (ap (mod ap)) (np (mod ap ap)) (s (mod ap ap np) (mod ap-adj)) (s-rel (mod ap ap np) (mod ap-adj)) (s-comp (mod ap ap np) (mod ap-adj)))
(p-word (np (head pp)) (pp (head)) (s (head pp) (vfin-particle)) (s-rel (head pp) (vfin-particle)) (s-comp (head pp) (vfin-particle)))
(c-word (s (comp s-comp)))
(noun (np (head)) (pp (head np)) (s (head np)) (s-rel (head np)) (s-comp (head)))
(pro-subst (np (head)) (pp (head np)) (s (head np)) (s-rel (head np)) (s-comp (head np)))
(pro-rel (np (head np s-rel)))
(pro-adj (np (det)) (pp (det np)) (s (det np)) (s-rel (det np)) (s-comp (det np)))
(verb (s (vfin-main)) (s-rel (vfin-main)) (s-comp (vfin-main)))
(interpunction (s (endmark)))
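A lookup in Table B might be sketched in Python as below. The dictionary holds only an excerpt of the table. The reading of each entry as a functional label followed by the chain of new constituents to be created is an interpretation, and the names `TABLE_B` and `candidates` are hypothetical.

```python
# Excerpt of Table B: word category -> constituent category -> entries.
# Each entry is read here (an assumption) as: functional label first,
# followed by the categories of any new constituents to be created.
TABLE_B = {
    "article": {
        "np": [("det",)],
        "pp": [("det", "np")],
        "s": [("det", "np")],
    },
    "adj": {
        "ap": [("head",)],
        "np": [("head", "ap")],
        "s": [("head", "ap", "np"), ("head", "ap-adj")],
    },
    "noun": {
        "np": [("head",)],
        "pp": [("head", "np")],
        "s": [("head", "np")],
    },
}


def candidates(word_category, constituent_category):
    """Return the Table B entries for a word in a given constituent.

    An empty list means the excerpt has no entry for this combination.
    """
    return TABLE_B.get(word_category, {}).get(constituent_category, [])
```

Under this reading, an article offered to a PP constituent yields `[("det", "np")]`: the label "det" inside a newly created NP. An adjective offered to an S constituent yields two alternatives, which would lead to two candidate sentence representations.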

Table C

(np (det nmod-a) head)
(pp () pobj)
(ap (mod))
(ap-adj (mod) head)
(ap-adv (mod) head)
(s () fnp)
(s-rel (fnp) vfin-main)
(s-comp (fnp) vfin-main)
(conj-s (seq))
(conj-pp (seq))
(conj-ap (seq))
(conj-s-rel (seq))

Table D

(ap (head (adj-inflected adj-not-inflected))
    (amod (adj-inflected adj-not-inflected)))
(np (det (neuter inneuter sing3 plu3 definite indefinite))
    (head (sing1 sing2 sing3 plu1 plu2 plu3 neuter inneuter)))

Table E

(s ((fnp fnp) (smod smod) (smod fnp) (fnp smod) (vfin-particle fnp) (vfin-particle) (fnp vfin-particle vfin-main fnp) (fnp vfin-particle fnp vfin-main fnp))
   ()
   ((vfin-main 1) (fnp 3)))
(s-rel () () ((vfin-main 1) (fnp 3)))
(s-comp () () ((vfin-main 1) (fnp 3)))
(np ((head head))
    ((det head) (nmod-a head) (head nmod-s) (det nmod-s))
    ((det 1) (head 1) (nmod-a 1)))
(pp ((head head) (head head pobj head))
    ()
    ((pobj 1)))
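Because Table E is written as nested lists, it can be read as data rather than hard-coded. The sketch below parses the NP entry with a small S-expression reader and applies the third sub-list. Reading that sub-list as the maximum number of occurrences per functional label is an assumption, and the names `parse_sexpr`, `NP_ENTRY` and `over_limit` are hypothetical.

```python
def parse_sexpr(text):
    """Parse a string of parenthesised lists into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        items = []
        while pos < len(tokens):
            tok = tokens[pos]
            if tok == "(":
                sub, pos = read(pos + 1)  # descend into a nested list
                items.append(sub)
            elif tok == ")":
                return items, pos + 1     # close the current list
            else:
                items.append(tok)
                pos += 1
        return items, pos

    items, _ = read(0)
    return items


# The NP entry of Table E, copied from the table.
NP_ENTRY = (
    "(np ((head head)) "
    "((det head) (nmod-a head) (head nmod-s) (det nmod-s)) "
    "((det 1) (head 1) (nmod-a 1)))"
)

category, combos, orders, limits = parse_sexpr(NP_ENTRY)[0]


def over_limit(labels, limits):
    """Return the labels that occur more often than their assumed maximum."""
    return [name for name, n in limits if labels.count(name) > int(n)]
```

With this entry, `over_limit(["det", "head", "head"], limits)` reports "head", the situation for which the Probability field of the constituent would be lowered.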

Table F

(np ((neuter inneuter) (sing1 sing2 sing3 plu1 plu2 plu3))
    ((definite adj-inflected) (plu3 adj-inflected) (inneuter adj-inflected) (neuter indefinite adj-not-inflected)))

Table G

(np (conj-np conj) (pp pobj) (s fnp) (s-rel fnp) (s-comp fnp))
(conj-np (pp pobj) (s fnp) (s-rel fnp) (s-comp fnp))
(ap (ap amod) (np nmod-a) (s smod) (s-rel smod) (conj-ap conj) (s-comp smod))
(conj-ap (ap amod) (np nmod-a) (s smod) (s-rel smod) (s-comp smod))
(ap-adj (s fnp) (s-rel fnp) (s-comp fnp))
(ap-adv (s smod) (s-rel smod) (s-comp smod))
(pp (np nmod-p) (s smod) (s-rel smod) (conj-pp conj) (s-comp conj))
(conj-pp (np nmod-p) (s smod) (s-rel smod) (s-comp smod))
(s (conj-s conj))
(s-comp (s fnp))
(s-rel (np nmod-s) (conj-s-rel conj))
(conj-s-rel (np nmod-s))
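Table G pairs the category of a constituent with the category of the constituent dominating it and gives a functional label for the constituent being closed. A minimal Python sketch of that lookup, using only an excerpt of the table and the hypothetical names `TABLE_G` and `closing_label`, might look like this:

```python
# Excerpt of Table G: category of the constituent being closed ->
# category of the dominating constituent -> functional label.
TABLE_G = {
    "np": {"conj-np": "conj", "pp": "pobj", "s": "fnp",
           "s-rel": "fnp", "s-comp": "fnp"},
    "ap": {"ap": "amod", "np": "nmod-a", "s": "smod",
           "s-rel": "smod", "conj-ap": "conj", "s-comp": "smod"},
    "s-rel": {"np": "nmod-s", "conj-s-rel": "conj"},
}


def closing_label(category, dominating_category):
    """Return the label for a constituent closed within its dominating one.

    None means the excerpt has no entry for this combination.
    """
    return TABLE_G.get(category, {}).get(dominating_category)
```

For instance, an NP closed inside a PP receives the label "pobj", whereas an NP closed inside an S receives the label "fnp", which Table H later maps to labels such as "subj" or "obj".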

Table H

((fnp vfin-main endmark) (subj))
((fnp vfin-main smod endmark) (subj))
((smod vfin-main fnp endmark) (subj))
((smod vfin-main fnp smod endmark) (subj))
((vfin-main fnp endmark) (obj))
((vfin-main fnp smod endmark) (obj))
((fnp vfin-main fnp endmark) (subj obj) (obj subj))
((fnp vfin-main smod fnp endmark) (subj obj))
((smod vfin-main fnp fnp endmark) (subj obj))
((fnp vfin-main fnp smod endmark) (subj obj) (obj subj))
((vfin-main fnp fnp endmark) (indobj obj))
((vfin-main fnp smod fnp endmark) (indobj obj))
((vfin-main fnp fnp smod endmark) (indobj obj))
((vfin-main smod fnp fnp endmark) (indobj obj))
((fnp vfin-main fnp smod smod endmark) (subj obj) (obj subj))
((fnp vfin-main smod fnp smod endmark) (subj obj))
;etc
((fnp vfin-main fnp fnp endmark) (subj indobj obj) (indobj subj obj) (obj subj indobj))
((fnp vfin-main smod fnp fnp endmark) (subj indobj obj))
((smod vfin-main fnp fnp fnp endmark) (subj indobj obj))
((fnp vfin-main fnp smod fnp endmark) (subj indobj obj) (indobj subj obj) (obj subj indobj))
((fnp vfin-main fnp fnp smod endmark) (subj indobj obj) (indobj subj obj) (obj subj indobj))
;etc
((fnp vfin-main) (subj))
((fnp fnp vfin-main) (subj obj))
((fnp fnp fnp vfin-main) (subj indobj obj))
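Table H can be read as mapping a sequence of provisional labels to one or more assignments of final labels for the "fnp" positions. The Python sketch below illustrates that reading on a three-entry excerpt. The names `TABLE_H` and `final_readings` are hypothetical, and the order of the alternatives carries no meaning in this sketch.

```python
# Excerpt of Table H: sequence of provisional labels ->
# alternative final labels for the successive "fnp" positions.
TABLE_H = {
    ("fnp", "vfin-main", "endmark"): [("subj",)],
    ("fnp", "vfin-main", "fnp", "endmark"): [("subj", "obj"), ("obj", "subj")],
    ("vfin-main", "fnp", "fnp", "endmark"): [("indobj", "obj")],
}


def final_readings(labels):
    """Return every label sequence obtained by replacing each "fnp"
    with the final labels listed for this pattern in the excerpt."""
    readings = []
    for finals in TABLE_H.get(tuple(labels), []):
        remaining = iter(finals)
        # Replace the "fnp" labels from left to right; keep the others.
        readings.append(
            [next(remaining) if label == "fnp" else label for label in labels]
        )
    return readings
```

A pattern with two alternatives, such as `fnp vfin-main fnp endmark`, yields two readings; each could then be tested further, for instance against the feature requirements of Table I.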

Table I

(s ((subj vfin-main) (sing3 sing2 sing1 plu3 plu2 plu1))
   ((subj) (nominative definite indefinite proper))
   ((obj) (not-nominative definite indefinite proper))
   ((indobj) (not-nominative definite indefinite proper)))
(s-rel ((subj vfin-main) (sing3 sing2 sing1 plu3 plu2 plu1))
       ((subj-rel vfin-main) (sing3 plu3 sing1 sing2 plu1 plu2))
       ((subj) (nominative definite indefinite proper))
       ((obj) (not-nominative definite indefinite proper))
       ((indobj) (not-nominative definite indefinite proper)))
(s-comp ((subj vfin-main) (sing3 sing2 sing1 plu3 plu2 plu1))
        ((subj) (nominative definite indefinite proper))
        ((obj) (not-nominative definite indefinite proper))
        ((indobj) (not-nominative definite indefinite proper)))

Claims

1. Method for parsing a sentence, expressed in natural language, into phrases to be described with functional indications, on the basis of word units which have been lexicalized according to verbal categories and application-oriented categories, which method is characterized by the following steps, to be carried out per word unit of the sentence: a. Determining, per word unit and per constituent, the functional word category within the constituent and/or within a new constituent to be created, on the basis of data about the verbal category of the word unit concerned and the category of the constituent; b. Describing, for each constituent, on the basis of data about the category of the constituent and the category of the constituent dominating it, a measure concerning the closure of the constituent, as well as assigning a functional label, whether or not provisional, to the constituent to be closed; c. Testing the current constituent, in at least one of the two preceding steps, against syntax-based rules regarding the coherence of words and/or constituents within this constituent, and, if necessary, revaluing an expectation factor assigned to the sentence representation, as well as selecting each sentence representation whose expectation factor lies above a certain threshold value.

2. Method for parsing a sentence in natural language according to claim 1, characterized in that the functional category per word unit and per constituent is also determined on the basis of the functional category of the previous word unit.

3. Method for parsing a sentence in natural language according to claim 1, which method, following the testing of the current constituent and, if necessary, the revaluing of the expectation factor, is further characterized by replacing a provisional functional label of a closed constituent by a final functional label.

4. Method for parsing a natural language sentence according to claim 3, which method, following the replacement of a provisional functional label, is characterized by revaluing the expectation factor assigned to the sentence representation concerned and selecting the sentence representation with the highest expectation factor.

5. Method for parsing a natural language sentence according to claim 1, characterized in that, after it has been established that no expectation factor assigned to a sentence representation lies above the threshold value used, this threshold value is set to a lower value.

6. Method for parsing a natural language sentence according to claim 1, which method is characterized by determining, per word unit, per word structure concerning that word unit and successively for each generated constituent, the functional word category within that constituent and/or new constituent to be created.

7. Method for parsing a natural language sentence according to claim 6, characterized in that, after the functional word category within the constituent and/or new constituent to be created has been determined for all word structures concerning a word unit, the description of a measure concerning the closure of the constituent takes place.

8. Method for parsing a sentence in natural language according to claim 6, characterized in that, each time after the functional word category within the constituent and/or new constituent to be created has been determined for one word structure concerning a word unit, a description of a measure concerning the closure of the constituent takes place.

9. Method for parsing a natural language sentence according to claim 1, which method is characterized by, per word unit, per word structure concerning that word unit and for each generated constituent, determining the functional word category within the constituent and/or new constituent to be created and describing a measure concerning the closure of the constituent.

10. Method for parsing a natural language sentence according to claim 1, which method is characterized by determining, per word unit, per generated constituent and successively for each word structure concerning that word unit, the functional word category within that constituent and/or new constituent to be created.

11. Method for parsing a sentence in natural language according to claim 10, characterized in that, after the functional word category within the constituent and/or new constituent to be created has been determined for all constituents, the description of a measure concerning the closure of the constituent takes place.

12. Method for parsing a sentence in natural language according to claim 10, characterized in that, each time after the functional word category within the constituent and/or new constituent to be created has been determined for one constituent, a description of a measure concerning the closure of the constituent takes place.

13. Method for parsing a natural language sentence according to claim 1, which method is characterized by, per word unit, per generated constituent and for each word structure concerning that word unit, determining the functional word category within the constituent and/or new constituent to be created and describing a measure concerning the closure of the constituent.

14. Method for parsing a natural language sentence according to claim 1, which method, for determining a functional category, is characterized by the following steps: a. Selecting, using data regarding the word structure and regarding the offered constituent, data in a memory relating to the functional label to be assigned to the word unit; b. Adding the word unit with the selected functional label to the offered constituent; c. Examining each constituent for the presence of incorrect combinations, incorrect order and incorrect number of the assigned functional labels; and d. Testing the constituent's probability factor against the applicable threshold value.

15. Method for parsing a natural language sentence according to claim 1, which method relates to