JPH03137771A - Syntax analyzer

Info

Publication number: JPH03137771A
Application number: JP1276273A
Authority: JP
Inventors: Katsuhiko Fujita; 克彦藤田
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1989-10-24
Filing date: 1989-10-24
Publication date: 1991-06-12

Abstract

PURPOSE: To improve processing efficiency and save memory by controlling the progress of syntax analysis based on stack depth. CONSTITUTION: A word reading part 1 reads one word at a time from the start of the sentence in the input word string, and a dictionary retrieving part 2 looks the word up using a dictionary part 5. A rule applying part 3 then applies phrase structure rules from a phrase structure rule part 6 to the retrieval results, and the results are stored in a stack area part 7. At this point, a stack check part 8 checks each stack in the stack area part and erases stacks that do not meet the conditions.

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of the Invention

The present invention relates to a syntax analyzer for natural language processing.

Prior Art

Conventionally, a well-known type of syntax analyzer reads the input character string one word at a time from the beginning of the sentence, looks each word up in a dictionary, and parses bottom-up from the beginning of the sentence toward the end using phrase structure rules.

In this style of syntax analysis, there are generally two ways of producing solutions.

1) Output all syntax analysis results (Parallel Parsing)
2) Output only one syntax analysis result (Sequential Parsing)

Problems to be Solved by the Invention

Of the two methods above, method 1) has a problem. To produce every possible solution, the analyzer must retain every intermediate result that could conceivably lead to one, no matter how unnatural it is. As a result, the longer the sentence, the more intermediate results must be retained, and their number grows explosively (this is called a combinatorial explosion).
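The growth described above can be illustrated with a short calculation. The sketch below is not taken from the patent: it assumes, purely for illustration, a grammar that permits every binary bracketing of the sentence, in which case the number of complete parse trees over n words is the Catalan number C(n-1). The function name `binary_bracketings` is hypothetical.

```python
from math import comb


def binary_bracketings(n_words: int) -> int:
    """Count the binary trees whose leaves are n_words words.

    Assumption (not from the patent): every binary bracketing is
    grammatical, so the count is the Catalan number C(n_words - 1).
    """
    if n_words < 1:
        raise ValueError("need at least one word")
    k = n_words - 1
    return comb(2 * k, k) // (k + 1)


# Print the count for a few sentence lengths to show the growth.
for n in (2, 5, 10, 20):
    print(n, binary_bracketings(n))
```

Under this worst-case assumption a 10-word sentence already has thousands of bracketings, which is the kind of growth that makes retaining every intermediate result impractical.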

Means for Solving the Problems

To solve this problem, the present invention provides, in a syntax analyzer that reads one word at a time from the beginning of a sentence, performs a dictionary search using a dictionary, and parses bottom-up from the beginning of the sentence toward the end using phrase structure rules, a stack area section that stores intermediate results of the syntax analysis in stack form and a stack inspection section that controls the progress of the syntax analysis based on the depth of the stacks.

Operation

Accordingly, the stack area section represents each intermediate result of the syntax analysis as a stack of subtrees, and the stack inspection section can erase unnatural intermediate results using stack depth as the criterion. Discarding useless or unnatural stacks based on their depth makes processing more efficient.
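As a rough illustration of "a stack consisting of subtrees", the sketch below models one intermediate result as a tuple of subtrees and takes the stack depth to be the length of the tuple. The names `Subtree` and `reduce_top` and the category labels N, P, and PP are assumptions made for this example; the patent does not specify a data layout.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Subtree:
    """One partial parse tree: a category label plus its child subtrees."""
    label: str
    children: tuple[Subtree, ...] = ()


def reduce_top(stack, lhs, rhs):
    """Apply the rule lhs -> rhs to the top of the stack.

    Returns the new stack, or None if the top of the stack does not
    match rhs. A successful reduction of len(rhs) subtrees into one
    parent lowers the stack depth by len(rhs) - 1.
    """
    n = len(rhs)
    if n == 0 or len(stack) < n:
        return None
    top = stack[-n:]
    if tuple(t.label for t in top) != tuple(rhs):
        return None
    return stack[:-n] + (Subtree(lhs, top),)


# Hypothetical example: a noun followed by a particle becomes one PP subtree.
stack = (Subtree("N"), Subtree("P"))
reduced = reduce_top(stack, "PP", ("N", "P"))
print(len(stack), len(reduced))
```

Because depth is simply the number of subtrees not yet combined, a stack that keeps growing without being reduced is one whose pieces are failing to combine, which is what the depth-based check exploits.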

Embodiment

An embodiment of the present invention will be described with reference to the drawings.

FIG. 1 shows the basic configuration of the device. The arrangement in which a word reading section 1 (which receives the input word string), a dictionary search section 2, a rule application section 3, and a final judgment section 4 perform bottom-up syntax analysis using a dictionary section 5 and a phrase structure rule section 6 is well known. This embodiment is characterized by the addition of a stack area section 7 and a stack inspection section 8. The stack area section 7 stores intermediate results of the syntax analysis in stack form, and the stack inspection section 8 controls the progress of the syntax analysis based on stack depth. In the figure, # indicates the flow of control and → indicates the flow of data.

FIG. 2 shows the basic processing flow of the device. The overall operation of the configuration in FIG. 1 is described below with reference to this figure.

The word reading section 1 reads one word from the beginning of the sentence in the word string, and the dictionary search section 2 looks that word up using the dictionary section 5. The rule application section 3 then applies phrase structure rules from the phrase structure rule section 6 to the result of the dictionary search and saves the result in the stack area section 7. At this point, the stack inspection section 8 inspects each stack in the stack area section 7 and erases any stack that does not meet the conditions. When all stacks have been inspected, processing returns to the initial state and moves on to the next word. When the words are finally exhausted, the final judgment process is performed, that is, a check of whether the result is a sentence, and the series of operations ends.
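The flow just described (read a word, look it up, apply rules, save, inspect, and finally judge) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the dictionary contents, the three rules, the use of category labels instead of full subtrees, and the names `apply_rules`, `inspect_stacks`, and `parse` are all hypothetical, and the inspection condition used here is a simple depth limit.

```python
# Hypothetical stand-ins for the dictionary section and the phrase
# structure rule section; the patent's Figures 5 and 6 are not reproduced.
DICTIONARY = {"家": ["N"], "へ": ["P"], "向かう": ["V"]}
RULES = [("PP", ("N", "P")), ("VP", ("PP", "V")), ("S", ("VP",))]


def apply_rules(stack):
    """Return the stack together with every stack reachable from it by
    reducing at the top with RULES (the rule application step)."""
    seen = {stack}
    frontier = [stack]
    while frontier:
        current = frontier.pop()
        for lhs, rhs in RULES:
            n = len(rhs)
            if len(current) >= n and current[-n:] == rhs:
                reduced = current[:-n] + (lhs,)
                if reduced not in seen:
                    seen.add(reduced)
                    frontier.append(reduced)
    return seen


def inspect_stacks(stacks, max_depth):
    """Stack inspection step: erase stacks whose depth is max_depth or more."""
    return {s for s in stacks if len(s) < max_depth}


def parse(words, max_depth=6):
    """Return True if some surviving stack is the single category S."""
    stacks = {()}  # start with one empty stack
    for word in words:                           # word reading
        next_stacks = set()
        for pos in DICTIONARY.get(word, []):     # dictionary search
            for stack in stacks:
                next_stacks |= apply_rules(stack + (pos,))
        stacks = inspect_stacks(next_stacks, max_depth)
    return ("S",) in stacks                      # final judgment


print(parse(["家", "へ", "向かう"]))
```

Keeping both the reduced and the unreduced stacks after each word is what lets the sketch explore alternatives in parallel; the inspection step is the only place where candidates are discarded.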

FIG. 5 shows an example of phrase structure rules, which are held in the phrase structure rule section 6. FIG. 6 shows examples of words; each entry has at least a "notation" (surface form) and a "part of speech", and the entries are held in the dictionary section 5.
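Since the contents of Figures 5 and 6 are not reproduced in this text, the following is only a guess at how such data might be represented. The entries, the rule numbers, and the names `Entry`, `Rule`, and `lookup` are hypothetical; the source states only that a word entry carries at least a notation and a part of speech, and that rules are numbered.

```python
from __future__ import annotations

from typing import NamedTuple


class Entry(NamedTuple):
    """One dictionary entry: surface form ("notation") and part of speech."""
    notation: str
    part_of_speech: str


class Rule(NamedTuple):
    """One numbered phrase structure rule: lhs -> rhs[0] rhs[1] ..."""
    number: int
    lhs: str
    rhs: tuple[str, ...]


# Hypothetical contents, for illustration only.
DICTIONARY = [
    Entry("家", "N"),
    Entry("へ", "P"),
    Entry("向かう", "V"),
]
RULES = [
    Rule(1, "PP", ("N", "P")),
    Rule(2, "VP", ("PP", "V")),
    Rule(3, "S", ("VP",)),
]


def lookup(word: str) -> list[Entry]:
    """Return every entry whose notation matches the word.

    A list is returned because one surface form may have several
    parts of speech; an unknown word yields an empty list.
    """
    return [e for e in DICTIONARY if e.notation == word]


print(lookup("家"))
```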

Next, the processing in the stack inspection section 8 is described.

FIGS. 3 and 4 show the algorithms used in the stack inspection section 8.

FIG. 3 shows an example in which stacks whose depth is equal to or greater than a predetermined value (max) are erased. FIG. 4 shows an example in which stacks that are deeper than the shallowest stack in the stack area by a fixed amount or more (min plus a constant) are erased.
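The two erasure conditions can be written as simple filters over a collection of stacks. The sketch below is one reading of the text, not a transcription of Figures 3 and 4, which are not available here; the function names and the parameter name `margin` (standing in for the constant added to min) are assumptions.

```python
def prune_by_max(stacks, max_depth):
    """FIG. 3 style: erase every stack whose depth is max_depth or more."""
    return [s for s in stacks if len(s) < max_depth]


def prune_by_relative(stacks, margin):
    """FIG. 4 style: erase every stack that is deeper than the shallowest
    stack by margin or more, i.e. whose depth is at least min + margin."""
    if not stacks:
        return []
    shallowest = min(len(s) for s in stacks)
    return [s for s in stacks if len(s) < shallowest + margin]


# Hypothetical stacks of depth 1, 2, and 4.
stacks = [("A",), ("A", "B"), ("A", "B", "C", "D")]
print(prune_by_max(stacks, 2))
print(prune_by_relative(stacks, 3))
```

The absolute limit bounds memory regardless of the input, while the relative limit adapts to the sentence: it keeps deep stacks when every candidate is deep and drops them only when much shallower alternatives exist.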

Consider now the stack processing when a sentence such as the following is input.

Suppose the sentence is "… 家 へ 向かう" ("… head to the house").

FIG. 7 shows the state of the stacks during the analysis. In the figure, → indicates a dictionary lookup and − indicates a rule application, and the symbol above each → corresponds to a rule number in the phrase structure rules of FIG. 5. As FIG. 7 shows, in this embodiment even the deepest stack needs only six levels.

Now compare this analysis of the sentence with the conventional example in FIG. 8. When no limit is placed on stack depth, a stack as deep as 10 levels, like the one on the left of FIG. 8, can arise. Applying phrase structure rules reduces it to the 9 levels shown on the right, but it never reaches S in the end; it is a so-called useless stack. Human cognition points the same way: once the number of subtrees exceeds a certain level, people have trouble even holding them in memory, so an excessively deep stack represents an unnatural state. This is why constraining stack depth, as in this embodiment, is worthwhile.

Effects of the Invention

In a syntax analyzer that reads one word at a time from the beginning of a sentence, performs a dictionary search using a dictionary, and parses bottom-up from the beginning of the sentence toward the end using phrase structure rules, the present invention provides a stack area section that stores intermediate results of the syntax analysis in stack form and a stack inspection section that controls the progress of the syntax analysis based on the depth of the stacks. The stack area section can therefore represent intermediate results of the syntax analysis as stacks of subtrees, and the stack inspection section can erase unnatural intermediate results using stack depth as the criterion. Discarding useless or unnatural stacks based on their depth improves processing efficiency and saves memory.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of the present invention. FIG. 2 is a flowchart showing its basic overall flow. FIGS. 3 and 4 show the stack inspection algorithms. FIG. 5 is an explanatory diagram showing an example of phrase structure rules. FIG. 6 is an explanatory diagram showing examples of words in the dictionary. FIG. 7 is an explanatory diagram showing the state of the stacks during analysis. FIG. 8 is an explanatory diagram showing the state of the stacks in the conventional method.

1: word reading section; 2: dictionary search section; 3: rule application section; 4: final judgment section; 5: dictionary section; 6: phrase structure rule section; 7: stack area section; 8: stack inspection section.

Applicant: Ricoh Co., Ltd.

[FIG. 7 text: 先生の家へ向かう]

Claims

A syntax analyzer that reads one word at a time from the beginning of a sentence, performs a dictionary search using a dictionary, and performs syntax analysis bottom-up from the beginning of the sentence toward the end using phrase structure rules, the syntax analyzer comprising: a stack area section that stores intermediate results of the syntax analysis in stack form; and a stack inspection section that controls the progress of the syntax analysis based on the depth of the stacks.