JPH07262189A

JPH07262189A - Extracting device for sentential form pattern

Info

Publication number: JPH07262189A
Application number: JP6051463A
Authority: JP
Inventors: Yuichi Tanaka; 裕一田中
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1994-03-23
Filing date: 1994-03-23
Publication date: 1995-10-13

Abstract

PURPOSE: To make parsing (syntax analysis) rules easy to correct by preparing a large amount of text in the target domain, extracting and integrating sentence patterns across all of it so that every sentence pattern is collected automatically, and using those sentence patterns as support. CONSTITUTION: The device comprises a morphological analysis unit 2 that performs morphological analysis on a large amount of text; a sentence pattern extraction unit 3 that extracts sentence patterns by applying sentence pattern extraction rules 4, created in advance, to the output of the morphological analysis unit 2; and a sentence pattern integration unit 6 that integrates the set of sentence patterns extracted by the sentence pattern extraction unit 3.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a sentence pattern extraction device that automatically extracts sentence patterns from text. Such patterns serve as basic material for creating syntax rules, keyword extraction rules, and the like in natural language processing. In particular, the invention relates to a sentence pattern extraction device that is given a large amount of input text and automatically extracts and creates sentence patterns from it.

[0002]

[Prior Art] In natural language processing, parsing, which analyzes the structure of the sentences in an input text, is one of the most basic processes. Parsing uses parsing rules prepared in advance and analyzes the structure of the whole sentence by repeatedly rewriting sentence elements according to those rules. The parsing rules are typically grammars proposed theoretically by linguists and rewritten into a form that a computer can process. However, sentences in real-world texts contain many exceptions, so in practice a considerable part of such parsing rules has to be modified.

[0003] In addition, as part of extracting information from large amounts of text, rules on sentence patterns are prepared for purposes such as extracting keywords, extracting relations among multiple keywords, and extracting knowledge such as text summaries. Information is then extracted from the pattern-specific positions of each sentence that matches one of these sentence patterns.

[0004] As described above, parsing rules and sentence patterns are essential to a wide range of natural language processing. Conventionally, however, they have been built up by theoretical methods that rely heavily on the linguistic insight of the person creating them.

[0005]

[Problems to Be Solved by the Invention] In the method described above, the created rules are applied to the target text and evaluated, and corrections are repeated based on the result. As a result, very delicate and difficult correction work generally tends to continue for a long time. Furthermore, there is no firm guarantee that rules created in this way apply correctly across the entire target text, or that they will apply equally correctly when new target text is added, so the correction work ends up continuing indefinitely.

[0006] To solve these problems, the present invention aims to prepare a large amount of text in the target domain, extract and integrate sentence patterns across all of it so that all sentence patterns are collected automatically, and make parsing rules easy to correct with the support of those sentence patterns.

[0007]

[Means for Solving the Problems] FIG. 1 is a block diagram showing the principle of the present invention. In FIG. 1, the text base unit 1 stores a large amount of text (sentences). Text that meets a selection condition can be selected and retrieved from this large amount of text. Selection conditions include, in addition to a condition specifying the field of the text, a condition specifying a surface character string, among others. Text (sentences) meeting these conditions is selected and subjected to the automatic sentence pattern extraction described later.
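As a rough illustration of the selection described above, the sketch below filters a small in-memory text base by field and by surface character string. It is only a sketch under stated assumptions: the `Text` record, the `select_texts` function, the field labels, and every sentence other than the 親クラス example are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class Text:
    field: str      # hypothetical field label, e.g. "software"
    sentence: str   # one stored sentence


def select_texts(texts, field=None, substring=None):
    """Return the texts that satisfy every condition that was given."""
    selected = []
    for text in texts:
        if field is not None and text.field != field:
            continue  # field condition not met
        if substring is not None and substring not in text.sentence:
            continue  # surface-string condition not met
        selected.append(text)
    return selected


# Hypothetical text base; only the first sentence appears in the specification.
corpus = [
    Text("software", "親クラスのスロットを親スロットと呼ぶ。"),
    Text("software", "スロットは値を保持する。"),
    Text("law", "この規則を細則と呼ぶ。"),
]

hits = select_texts(corpus, field="software", substring="と呼ぶ")
```

Leaving a condition as `None` simply disables it, so the same function covers selection by field only, by surface string only, or by both.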

[0008] The morphological analysis unit 2 performs morphological analysis on the text. The sentence pattern extraction unit 3 applies the sentence pattern extraction rules 4 to the result of the morphological analysis and extracts sentence patterns.

[0009] The sentence pattern extraction rules 4 are rules for extracting sentence patterns from the result of morphological analysis. The sentence patterns 5 are the set of sentence patterns extracted from sentences by the sentence pattern extraction unit 3.

[0010] The sentence pattern integration unit 6 integrates the set of sentence patterns extracted by the sentence pattern extraction unit 3. The parsing unit 7 parses a natural language sentence by applying the parsing rules 8 to it.

[0011] The result display/editing unit 9 displays the parsing result produced by the parsing unit 7 together with the parsing rules 8 that were used and, when the parsing rules 8 are modified, displays the parsing result obtained after the modification and related information.

[0012]

[Operation] As shown in FIG. 1, in the present invention the morphological analysis unit 2 performs morphological analysis, one sentence at a time, on the large amount of input text; the sentence pattern extraction unit 3 applies the sentence pattern extraction rules 4 to the morphological analysis result to extract sentence patterns 5; and the sentence pattern integration unit 6 gathers the set of sentence patterns 5 and generates integrated sentence patterns.

[0013] For a natural language sentence, the parsing unit 7 performs parsing by applying the parsing rules 8 created in advance. The parsing result is displayed, and the parsing rules 8 used to obtain it are displayed in association with the corresponding parts of the result. The result display/editing unit 9 retrieves the integrated sentence pattern relating to the natural language sentence and displays it alongside. When an instruction is given to modify a parsing rule 8 used in the displayed parsing result, and the modified rule conforms to the integrated sentence pattern, the modified parsing rule 8 is passed to the parsing unit 7, which parses the sentence again; the new parsing result is displayed in association with the parsing rules 8 used to obtain it. The parsing rules 8 are edited by repeating this cycle.

[0014] Accordingly, by preparing a large amount of text in the target domain and extracting and integrating sentence patterns across all of it, and by editing the parsing rules 8 with reference to the sentence patterns so that the desired parsing result is produced, all sentence patterns can be collected automatically from a large number of sentences, and parsing rules can easily be modified or created with the support of the sentence patterns.

[0015]

[Embodiment] Next, the operation of the configuration of FIG. 1 when sentence patterns are automatically extracted from a large number of sentences is described in detail with reference to FIG. 2(a) and FIGS. 3 to 6.

[0016] FIG. 2(a) is a flowchart of automatic sentence pattern extraction. In FIG. 2(a), S1 reads in one sentence. The sentence is taken from the large amount of text (sentences) in the text base unit 1 of FIG. 1, for example from the large amount of text in a specific field designated by a selection condition.

[0017] S2 performs morphological analysis on the sentence read in S1. For example, the sentence 「親クラスのスロットを親スロットと呼ぶ。」 ("A slot of the parent class is called a parent slot.") is divided by morphological analysis into units that would lose their meaning if divided any further. Here it is divided as indicated by the underlines in FIG. 3(a), shown below with slashes in place of the underlines: 親クラス / の / スロット / を / 親スロット / と / 呼ぶ / 。

[0018] S3 extracts the sentence pattern by applying the sentence pattern extraction rules 4 to the result of the morphological analysis in S2. For example, the sentence pattern extraction rules 4 of FIG. 4 are applied repeatedly to the morphological analysis result of FIG. 3(a), rewriting it as shown in FIG. 3(b), (c), (d), and (e), and the sentence pattern of FIG. 3(e), to which no further rule can be applied, is extracted. The steps are described concretely below.

[0019] (1) 親クラス ("parent class") is a common noun, so it is replaced with the category "noun" (名詞類); the common noun itself is then represented by an arbitrary surface expression (*), giving (noun, *).

[0020] (2) の is a particle, so it is replaced with the category "particle"; the particle itself is kept as 「の」, giving (particle, 「の」).

[0021] (3) スロット ("slot") is a common noun, so it is replaced with the category "noun"; the common noun itself is then represented by an arbitrary surface expression (*), giving (noun, *).

[0022] (4) Through (1) to (3), the phrase is rewritten as (noun, *) (particle, 「の」) (noun, *), as shown in FIG. 3(b).

[0023] (5) The sentence pattern extraction rule in FIG. 4(a), (noun, *) (particle, 「の」) (noun, *) → (noun, N), is applied to the result of (4), replacing it with (noun, N) (see FIG. 3(c) and (d)).

[0024] (6) Similarly, the morpheme を is replaced with (particle, 「を」).

[0025] (7) Similarly, the morpheme 親スロット ("parent slot") is replaced with (noun, N).

[0026] (8) Similarly, the morpheme と is replaced with (particle, 「と」).

[0027] (9) Similarly, the morpheme 呼ぶ ("call") is replaced with (verb, 「呼ぶ」).

[0028] (10) Through the replacements in (1) to (9), the sentence is rewritten as (noun, N) (particle, 「を」) (noun, N) (particle, 「と」) (verb, 「呼ぶ」), as shown in FIG. 3(d).

[0029] (11) Because (10) is a definition sentence, it is converted to 「NをNと呼ぶ。」 ("N is called N."), and the desired sentence pattern is thus extracted automatically.
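Steps (1) to (11) can be pictured as a small rewriting loop that keeps applying rules until none applies. The sketch below is an illustration under stated assumptions, not the patented implementation: the morphemes are hard-coded instead of coming from a real morphological analyzer, the tuple representation and all function names are hypothetical, and the second rule (a lone (noun, *) becomes (noun, N)) is inferred from step (7) rather than confirmed from FIG. 4.

```python
# Hard-coded morphological analysis of the example sentence (assumption:
# a real system would obtain this from a morphological analyzer).
MORPHEMES = [
    ("親クラス", "common noun"),
    ("の", "particle"),
    ("スロット", "common noun"),
    ("を", "particle"),
    ("親スロット", "common noun"),
    ("と", "particle"),
    ("呼ぶ", "verb"),
]

# Rewriting rules as (left-hand side, right-hand side), tried in this order.
RULES = [
    # Rule quoted in step (5).
    ([("noun", "*"), ("particle", "の"), ("noun", "*")], [("noun", "N")]),
    # Assumed rule, inferred from step (7).
    ([("noun", "*")], [("noun", "N")]),
]


def to_elements(morphemes):
    """Steps (1)-(3), (6), (8), (9): map morphemes to (category, value) pairs."""
    elements = []
    for surface, part_of_speech in morphemes:
        if part_of_speech == "common noun":
            elements.append(("noun", "*"))      # any surface form
        else:
            elements.append((part_of_speech, surface))
    return elements


def apply_once(elements, rules):
    """Apply the first rule that matches anywhere; return None if none does."""
    for lhs, rhs in rules:
        width = len(lhs)
        for start in range(len(elements) - width + 1):
            if elements[start:start + width] == lhs:
                return elements[:start] + rhs + elements[start + width:]
    return None


def extract_pattern(morphemes, rules=RULES):
    """Rewrite repeatedly until no rule applies, as in S3."""
    elements = to_elements(morphemes)
    while True:
        rewritten = apply_once(elements, rules)
        if rewritten is None:
            return elements
        elements = rewritten


def render(elements):
    """Step (11): write the pattern as a sentence-like string."""
    return "".join(value for _, value in elements) + "。"


pattern = extract_pattern(MORPHEMES)
text = render(pattern)   # "NをNと呼ぶ。"
```

Every rule in this sketch removes at least one (noun, *) element and never creates one, so the loop is guaranteed to stop; a different rule set would need its own termination argument.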

[0030] S4 stores the sentence pattern corresponding to the sentence, as the sentence patterns 5 of FIG. 1. S5 determines whether processing is finished, for example whether sentence patterns have been extracted from all of the many sentences of the field in question. If YES, the process proceeds to S6. If NO, it returns to S1, reads in the next sentence, and repeats S2 onward.

[0031] S6 groups the sentence patterns stored in S4 by sentence type (for example, definition sentences, method sentences, and so on). S7 integrates the grouped results, that is, it consolidates the set of sentence patterns gathered for each sentence type. For example, the two groups of four definition-sentence patterns in FIG. 5 are each consolidated into the corresponding integrated pattern in FIG. 6 (see the description of FIG. 6).
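The specification does not say how the integration unit merges patterns internally, so the following is only one possible sketch of S6 and S7. It assumes the simplest case: patterns of the same sentence type that have the same number of tokens and differ at a single position are merged by listing the alternatives at that position. The function names, the pre-tokenized input, and the "method" entry are hypothetical.

```python
from collections import defaultdict


def group_by_type(labelled_patterns):
    """S6: collect distinct token tuples under their sentence type."""
    groups = defaultdict(list)
    for sentence_type, tokens in labelled_patterns:
        if tokens not in groups[sentence_type]:
            groups[sentence_type].append(tokens)
    return dict(groups)


def merge_single_slot(patterns):
    """S7 (assumed strategy): merge patterns that differ at one position.

    `patterns` must be a non-empty list of token tuples. Returns the merged
    pattern as a string, or None when this simple strategy does not apply.
    """
    first = patterns[0]
    if any(len(p) != len(first) for p in patterns):
        return None
    differing = [i for i in range(len(first))
                 if any(p[i] != first[i] for p in patterns)]
    if len(differing) > 1:
        return None
    merged = list(first)
    if differing:
        slot = differing[0]
        alternatives = []
        for p in patterns:
            if p[slot] not in alternatives:
                alternatives.append(p[slot])
        merged[slot] = "{" + "|".join(alternatives) + "}"
    return "".join(merged)


# The three definition patterns are the ones listed in the text; the
# "method" entry is a made-up placeholder.
stored = [
    ("definition", ("N", "を", "N", "と", "いう", "。")),
    ("definition", ("N", "を", "N", "と", "呼ぶ", "。")),
    ("definition", ("N", "を", "N", "と", "よぶ", "。")),
    ("method", ("N", "を", "N", "する", "。")),
]

groups = group_by_type(stored)
integrated = merge_single_slot(groups["definition"])
```

Optional elements, written with [ ] in FIG. 6, are outside the scope of this sketch; handling them would require aligning patterns of different lengths.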

[0032] In this way, a large number of sentences in a given field are morphologically analyzed one at a time, the sentence pattern extraction rules of FIG. 4 are applied to extract the sentence patterns of FIG. 5, and the set of sentence patterns of FIG. 5 is consolidated to automatically generate the integrated sentence patterns of FIG. 6. This makes it possible to extract sentence patterns automatically from a large number of sentences.

[0033] FIG. 3 illustrates a concrete example of the present invention. FIG. 3(a) shows the result of morphological analysis. Each underlined portion is a character string produced by the morphological analysis (a unit that would lose its meaning if divided further).

[0034] FIG. 3(b) shows how, within the morphological analysis result, 親クラスのスロット is rewritten as (noun, *) (particle, 「の」) (noun, *) on the basis of the common noun 親クラス, the particle の, and the common noun スロット.

[0035] FIG. 3(c) shows the result, (noun, N), of applying the sentence pattern extraction rule of FIG. 4 to FIG. 3(b).

[0036] FIG. 3(d) shows the result of applying the sentence pattern extraction rules of FIG. 4 in the same way as for FIG. 3(c): (noun, N) (particle, 「を」) (noun, N) (particle, 「と」) (verb, 「呼ぶ」).

[0037] FIG. 3(e) shows FIG. 3(d), which is a definition sentence, converted to 「NをNと呼ぶ。」 and thus replaced with a sentence pattern.

[0038] FIG. 4 shows examples of the sentence pattern extraction rules of the present invention. Here, * means that anything is acceptable. For example, the * in (noun, *) means that any noun is acceptable.

[0039] N is a symbol representing a noun. The rule (noun, *) (particle, 「の」) (noun, *) → (noun, N) is the sentence pattern extraction rule that was applied to FIG. 3(b) above to rewrite it as FIG. 3(c).
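The meaning of the wildcard * in a rule can be made concrete with a small matcher. This is a minimal sketch, assuming that both rule elements and analyzed morphemes are represented as (category, surface) pairs; that representation and the function names are hypothetical and not taken from FIG. 4.

```python
def element_matches(pattern_element, morpheme):
    """True if one (category, surface) rule element accepts one morpheme."""
    category, surface = pattern_element
    morpheme_category, morpheme_surface = morpheme
    if category != morpheme_category:
        return False
    # "*" accepts any surface form; anything else must match exactly.
    return surface == "*" or surface == morpheme_surface


def sequence_matches(pattern, morphemes):
    """True if the pattern accepts the morphemes position by position."""
    if len(pattern) != len(morphemes):
        return False
    return all(element_matches(p, m) for p, m in zip(pattern, morphemes))


# Left-hand side of the rule (noun, *) (particle, の) (noun, *) -> (noun, N).
NOUN_NO_NOUN = [("noun", "*"), ("particle", "の"), ("noun", "*")]

phrase = [("noun", "親クラス"), ("particle", "の"), ("noun", "スロット")]
other = [("noun", "親クラス"), ("particle", "を"), ("noun", "スロット")]

matches_phrase = sequence_matches(NOUN_NO_NOUN, phrase)   # True
matches_other = sequence_matches(NOUN_NO_NOUN, other)     # False
```

The particle must match literally while either noun may be anything, which is why the same rule covers every "noun の noun" phrase in the corpus.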

[0040] FIG. 5 shows examples of sentence patterns extracted according to the present invention. This is the set of sentence patterns automatically extracted by morphologically analyzing sentences as in FIG. 3 above and applying the sentence pattern extraction rules of FIG. 4. When the two groups of four definition-sentence patterns shown there are integrated, each group is consolidated into the corresponding definition-sentence pattern of FIG. 6.

[0041] FIG. 6 shows examples of integrated sentence patterns according to the present invention. As described above, each is a sentence pattern obtained by consolidating one group of four sentence patterns from FIG. 5. In these integrated sentence patterns, [ ] is a symbol indicating that the enclosed element may or may not be present.

[0042] | is a symbol meaning "or" (any one of the alternatives). Therefore, leaving out the optional elements enclosed in [ ], the corresponding integrated sentence pattern of FIG. 6 becomes 「NをNと{いう|呼ぶ|よぶ}。」. Since | means "or", this represents the following three patterns: ・NをNという。

[0043] ・NをNと呼ぶ。 ・NをNとよぶ。
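The expansion of an integrated pattern into the concrete patterns it stands for can be sketched as a Cartesian product over the choices at each position. The list-based encoding below, with `("alt", [...])` for alternatives separated by | and `("opt", ...)` for an element in [ ], is a hypothetical representation chosen for illustration; the optional element "X" in the second example is a placeholder and is not taken from FIG. 6.

```python
from itertools import product


def expand(pattern):
    """List every concrete pattern that an integrated pattern represents."""
    choices = []
    for item in pattern:
        if isinstance(item, str):
            choices.append([item])            # literal element
        elif item[0] == "alt":
            choices.append(list(item[1]))     # one of several alternatives
        elif item[0] == "opt":
            choices.append(["", item[1]])     # absent or present
        else:
            raise ValueError(f"unknown pattern item: {item!r}")
    return ["".join(combination) for combination in product(*choices)]


# The definition pattern from the text, with optional elements left out.
DEFINITION = ["N", "を", "N", "と", ("alt", ["いう", "呼ぶ", "よぶ"]), "。"]
expanded = expand(DEFINITION)

# Placeholder example of an optional element.
WITH_OPTION = ["N", ("opt", "X"), "を"]
expanded_option = expand(WITH_OPTION)
```

Each optional element doubles the number of concrete patterns and each alternative group multiplies it by the number of alternatives, which is why a single integrated pattern can stand in for many extracted ones.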

[0044] Next, the operation of the configuration of FIG. 1 for displaying parsing results and editing parsing rules is described in detail with reference to FIG. 2(b) and FIGS. 7 to 9. FIG. 2(b) is a flowchart of parsing rule editing.

[0045] In FIG. 2(b), S11 reads in one sentence from the group of texts (sentences) to be parsed in the text base unit 1 of FIG. 1.

[0046] S12 performs parsing: the sentence read in S11 is morphologically analyzed and then parsed by applying the parsing rules 8. S13 displays the parsing result. For example, as shown in FIG. 8 described later, the parsing result and the parsing rules 8 applied to obtain it are displayed on the screen in association with each other.

[0047] In S14, the user judges whether the parsing result displayed in S13 is satisfactory. If YES, the parsing result for the sentence is satisfactory, so the process returns to S11 and reads in the next sentence. If NO, the parsing result is not satisfactory, so the user modifies the parsing rules 8 in S15. That is, the parsing result, the parsing rules 8 used to obtain it, and the sentence pattern corresponding to the sentence are displayed together on the screen as in FIG. 8; the user compares the parsing rules with the sentence pattern and modifies the parsing rules 8 that were used so that the parsing result becomes satisfactory.

[0048] S16 checks the parsing rules 8 modified in S15 against the sentence patterns 5, that is, against the sentence pattern of the sentence being parsed, and determines whether they conform. If YES, they conform, so the process returns to S12, parsing is performed again with the modified parsing rules 8, and the parsing result and the parsing rules used are displayed on the screen in association with each other in S13. If the result is then judged satisfactory in S14, the modified parsing rules 8 are saved, and the process returns to S11 to read in the next sentence. If the determination in S16 is NO, a message prompting the user to redo the modification of the parsing rules is displayed, and the process returns to S15.
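The control flow of S11 to S16 can be sketched as two nested loops. Because parsing, the user's judgment, the user's edits, and the conformity check are not specified in executable detail, the sketch below takes them as caller-supplied functions; all parameter names are hypothetical, and the stub callables in the usage part exist only to exercise the loop.

```python
def edit_rules(sentences, rules, parse, is_good, revise, conforms,
               pattern_for, notify=print):
    """Run the S11-S16 loop and return the rule set that is finally kept."""
    for sentence in sentences:                                   # S11
        pattern = pattern_for(sentence)
        while True:
            result = parse(sentence, rules)                      # S12
            notify(f"result={result!r} rules={sorted(rules)}")   # S13
            if is_good(result):                                  # S14: YES
                break
            while True:
                candidate = revise(rules, result, pattern)       # S15
                if conforms(candidate, pattern):                 # S16: YES
                    rules = candidate
                    break
                notify("please revise the rules again")          # S16: NO
    return rules


# Usage with stub callables (all of them made up for this demonstration).
log = []
attempts = iter([{"r1", "bad-fix"}, {"r1", "good-fix"}])

final_rules = edit_rules(
    sentences=["親クラスのスロットを親スロットと呼ぶ。"],
    rules={"r1"},
    parse=lambda sentence, rules: "ok" if "good-fix" in rules else "wrong",
    is_good=lambda result: result == "ok",
    revise=lambda rules, result, pattern: next(attempts),
    conforms=lambda candidate, pattern: "bad-fix" not in candidate,
    pattern_for=lambda sentence: "NをNと呼ぶ。",
    notify=log.append,
)
```

Returning the rule set stands in for the "save" step; a real system would persist the rules once the user accepts the re-parsed result.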

[0049] In this way, natural language sentences are read in one at a time and parsed by applying the parsing rules 8, and the parsing result and the rules used are displayed on the screen in association with each other. When the user judges the parsing result to be satisfactory, the process moves on to the next sentence. When the user judges it unsatisfactory, the user compares the sentence pattern corresponding to the sentence with the parsing rules 8 that were used and modifies the parsing rules so that a more desirable parsing result is obtained. Parsing is then performed with the modified parsing rules, and the result and the rules used are displayed in association with each other; if the result is satisfactory, the parsing rules have been successfully modified and are saved. The details are described in order below.

[0050] FIG. 7 is a flowchart of parsing rule modification according to the present invention. In FIG. 7, S21 displays the parsing result. For example, as shown under "parsing result" in FIG. 8, the parsing result generated by applying the parsing rules 8 created in advance is displayed graphically.

[0051] S22 retrieves the relevant sentence pattern from the stored sentence patterns, using the type of the example sentence as the key, and displays it. That is, using the type of the example sentence in FIG. 8 (for example, definition sentence) as the key, it retrieves and displays the integrated sentence pattern that includes the sentence pattern extracted by applying the sentence pattern extraction rules 4 as described above.

[0052] S23 retrieves the relevant parsing rules, using as keys the rule numbers of the parsing rules 8 shown in the parsing result, and displays them side by side. That is, for the parsing result displayed in S21, the parsing rules that were used are retrieved and displayed as shown under "parsing rules used" in FIG. 8.
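The two keyed lookups in S22 and S23 amount to dictionary access: the integrated pattern is fetched by sentence type, and the parsing rules are fetched by the rule numbers that appear in the parsing result. The sketch below assumes in-memory dictionaries; the function name is hypothetical, and the rule bodies are invented placeholders, since the actual rules of FIG. 8 are not reproduced in the text.

```python
# Integrated patterns keyed by sentence type (S22). The definition pattern
# is the one spelled out in the text, with optional elements left out.
INTEGRATED_PATTERNS = {"definition": "NをNと{いう|呼ぶ|よぶ}。"}

# Parsing rules keyed by rule number (S23). Placeholder rule bodies only.
PARSING_RULES = {
    1: "placeholder rule 1",
    2: "placeholder rule 2",
    3: "placeholder rule 3",
}


def collect_display_data(sentence_type, used_rule_numbers,
                         patterns=INTEGRATED_PATTERNS, rules=PARSING_RULES):
    """Return the pattern and the used rules to show next to a parse result."""
    pattern = patterns.get(sentence_type)          # None if no pattern stored
    used = [(number, rules[number])
            for number in sorted(set(used_rule_numbers))
            if number in rules]                    # skip unknown numbers
    return pattern, used


pattern, used = collect_display_data("definition", [2, 1, 2])
```

Sorting and de-duplicating the rule numbers keeps the "parsing rules used" list stable even when the same rule is applied at several nodes of the parse tree.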

[0053] In S24, the user compares and examines the two (the sentence pattern and the parsing rules 8) and creates better parsing rules, and parsing is performed with them to correct the parsing result.

[0054] S25 displays the corrected parsing result. In S24 and S25, for example, the parsing rules 8 used for the parsing result of FIG. 8 are modified, parsing is performed again, and the corrected result is displayed as in the parsing result of FIG. 9.

[0055] S26 determines whether the parsing result corrected and displayed in S25 is acceptable. If YES, the modification has succeeded, so the parsing rules 8 are saved and the process ends. If NO, the process returns to S23 and repeats.

[0056] In this way, a natural language sentence is parsed by applying the parsing rules 8; the parsing result and the parsing rules used are displayed in association with each other, together with the sentence pattern of the sentence; and the user compares and examines the sentence pattern and the parsing rules used and modifies the parsing rules. Parsing is then performed again with the modified parsing rules, and the parsing result and the rules used are displayed. This is repeated, modifying the parsing rules until the desired parsing result is obtained, at which point the parsing rules are saved. As a result, the parsing rules needed for parsing can easily be modified and created with reference to the sentence patterns.

[0057] FIG. 8 shows an example of the parsing result for a sentence according to the present invention. In the "parsing result" column, the top line shows the input natural language sentence, the underlines show the result of morphological analysis, and the third and subsequent lines show the tree structure of the parsing result together with the numbers of the parsing rules used.

[0058] The "parsing rules used" column shows the parsing rules corresponding to the rule numbers that appear in the "parsing result" column. The "sentence pattern" column shows the sentence pattern of the natural language sentence.

[0059] FIG. 9 shows an example of a corrected parsing result for a sentence according to the present invention. Because the parsing result of FIG. 8 was not the desired one, the user modified the parsing rules as illustrated, with reference to the sentence pattern, and FIG. 9 shows the parsing result generated accordingly. The rule numbers in the figure, including (10) and (11), correspond to the modified parsing rules with the same numbers. Since the parsing result of FIG. 9 is acceptable, the parsing rules used are saved.

[0060]

[Effects of the Invention] As described above, the present invention prepares a large amount of text in the target domain and extracts and integrates sentence patterns across all of it, and allows the parsing rules 8 to be edited with reference to the sentence patterns so that the desired parsing result is produced. Consequently, all sentence patterns can be collected automatically from a large number of natural language sentences, and parsing rules can easily be modified and created with the support of the sentence patterns.

[Brief description of drawings]

FIG. 1 is a block diagram showing the principle of the present invention.

FIG. 2 is a flowchart explaining the overall operation of the present invention.

FIG. 3 is a diagram illustrating a concrete example of the present invention.

FIG. 4 shows examples of sentence pattern extraction rules according to the present invention.

FIG. 5 shows examples of sentence patterns extracted according to the present invention.

FIG. 6 shows examples of integrated sentence patterns according to the present invention.

FIG. 7 is a flowchart of parsing rule modification according to the present invention.

FIG. 8 shows an example of the parsing result for a sentence according to the present invention.

FIG. 9 shows an example of a corrected parsing result for a sentence according to the present invention.

[Explanation of symbols]

1: Text base unit
2: Morphological analysis unit
3: Sentence pattern extraction unit
4: Sentence pattern extraction rules
5: Sentence patterns
6: Sentence pattern integration unit
7: Parsing unit
8: Parsing rules
9: Result display/editing unit

Claims

[Claims]

1. A sentence pattern extraction device for automatically extracting sentence patterns from text, comprising: a morphological analysis unit (2) that performs morphological analysis on a large amount of text; a sentence pattern extraction unit (3) that extracts sentence patterns by applying sentence pattern extraction rules (4), created in advance, to the result of the morphological analysis by the morphological analysis unit (2); and a sentence pattern integration unit (6) that integrates the set of sentence patterns extracted by the sentence pattern extraction unit (3).

2. The sentence pattern extraction device according to claim 1, further comprising: a parsing unit (7) that parses a natural language sentence by applying parsing rules (8) created in advance, causes the parsing result to be displayed, and causes the parsing rules (8) used to obtain the parsing result to be displayed in association with the corresponding parts of the parsing result; and a result display/editing unit (9) that retrieves the integrated sentence pattern relating to the natural language sentence and displays it alongside, determines, in response to an instruction to modify a parsing rule (8) that was used, whether the modified rule conforms to the integrated sentence pattern, passes the modified parsing rule (8) to the parsing unit (7) when it conforms so that parsing is performed, displays the parsing result in association with the parsing rules (8) used to obtain it, and repeats this cycle to edit the parsing rules (8).