JPH0756919A

JPH0756919A - Japanese analysis method

Info

Publication number: JPH0756919A
Application number: JP5201655A
Authority: JP
Inventors: Satoshi Shirai (白井 諭)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1993-08-13
Filing date: 1993-08-13
Publication date: 1995-03-03

Abstract

PURPOSE: To provide a Japanese analysis method that efficiently recognizes the dependency relations between bunsetsu in long Japanese sentences containing many bunsetsu, and that minimizes ambiguity in dependency attachment. CONSTITUTION: A morphological analysis unit 2 refers to a morpheme dictionary 5 and segments the input sentence into words to determine the bunsetsu. A dependency analysis unit 3 refers to a dependency dictionary 6 and extracts groups of one or more bunsetsu that constitute predicate phrases. The predicate phrases are classified stepwise according to their strength of independence so as to fit the expressive structure of Japanese, and the structure of the input sentence is analyzed based on that strength of independence. The dependency relations between the individual bunsetsu are then recognized for the structure-analyzed input sentence, and the Japanese input sentence is thereby parsed.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a Japanese analysis method that parses an input sentence by recognizing the dependency relations between bunsetsu.

[0002]

[Prior Art] A well-known conventional method of parsing Japanese sentences classifies the function of each bunsetsu and, following this classification, determines the dependency pattern of the whole sentence by recognizing the dependencies between bunsetsu through trial and error. It relies on the general rules of dependency based on so-called phrase structure grammar, namely: (1) a dependent bunsetsu is placed before its head bunsetsu; (2) a dependent bunsetsu can have only one head bunsetsu; and (3) dependency relations do not cross.
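The three general rules can be checked mechanically. The following is a minimal sketch, not part of the patent: it assumes a dependency structure is given as a list of (dependent, head) index pairs over bunsetsu positions, and the function name `check_general_rules` is hypothetical.

```python
def check_general_rules(arcs):
    """Return the sorted list of general rules (1, 2, 3) that `arcs` violates.

    `arcs` is a list of (dependent, head) pairs of bunsetsu positions.
    This representation is an assumption made for illustration.
    """
    violated = set()

    # Rule (1): a dependent bunsetsu is placed before its head bunsetsu.
    if any(dep >= head for dep, head in arcs):
        violated.add(1)

    # Rule (2): a dependent bunsetsu has only one head bunsetsu.
    dependents = [dep for dep, _ in arcs]
    if len(dependents) != len(set(dependents)):
        violated.add(2)

    # Rule (3): dependency relations do not cross.
    spans = [(min(d, h), max(d, h)) for d, h in arcs]
    for a, b in spans:
        for c, d in spans:
            if a < c < b < d:
                violated.add(3)

    return sorted(violated)


print(check_general_rules([(0, 1), (1, 3), (2, 3)]))  # well-formed structure
print(check_general_rules([(0, 2), (1, 3), (2, 3)]))  # arcs 0->2 and 1->3 cross
```

A trial-and-error parser of the kind described above would generate candidate arc sets and discard those for which this check reports a violation.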

[0003] Moreover, because the general rules alone give extremely low accuracy on long sentences (10 or more bunsetsu), a method has recently been proposed that aims to improve accuracy by focusing on the similarity of bunsetsu sequences and detecting the parallel (coordinate) structure of phrases or clauses (Kurohashi and Nagao, "Estimation of Parallel Structures in Long Japanese Sentences," Information Processing Society of Japan SIG Notes NL-86-2, November 15, 1991; and "Syntactic Analysis of Long Japanese Sentences Based on the Detection of Parallel Structures," Information Processing Society of Japan SIG Notes NL-88-1, March 12, 1992). Under the strict condition that no ambiguity is left in the analysis, it achieves an analysis accuracy of 66% on sentences averaging 14.2 bunsetsu.

[0004]

[Problems to Be Solved by the Invention] With the general rules of dependency described above, analyzing long sentences is extremely difficult. The method of Kurohashi et al., in turn, has the fundamental problem that it cannot be effective when no similarity between bunsetsu sequences can be found.

[0005] For example, the 14-bunsetsu sentence 「出版取次は／もともと／利益率が／低い／ことに／加えて、／出版物の／需要が／鈍化している／ため／苦しい／経営を／余儀なく／されている。」 ("Publishing distributors, in addition to having inherently low profit margins, are being forced into difficult management because demand for publications is slowing"; Nikkei Sangyo Shimbun; "／" marks a bunsetsu boundary) contains five predicates (「低い」, 「加えて、」, 「鈍化している」, 「苦しい」, 「されている。」), yet there is no similarity between bunsetsu sequences that could serve as a clue to clause coordination. Furthermore, because 「低い／ことに／加えて」 and 「鈍化している／ため」 each form a predicate phrase, the correct analysis may not be obtained when dependency relations are recognized bunsetsu by bunsetsu, as in the conventional method.

[0006] Likewise, the 20-bunsetsu sentence 「（××社は／）水道事業での／ビリングマシン導入の／時代から／自治体への／納入実績が／あるが、／その後／オフコンに／なって／ライバル他社に／押されぎみに／なっており，／ソフトウェアを／体系的に／見直すとともに／プロジェクトチームを／強化，／自治体市場での／失地回復を／狙う。」 ("(Company XX) has a record of deliveries to local governments dating from the era when billing machines were introduced in the water utility business, but since the subsequent shift to office computers it has been losing ground to rival companies, so it will systematically review its software, strengthen its project team, and aim to recover lost ground in the local government market"; Nikkei Sangyo Shimbun) contains six predicate phrases (「あるが、」, 「なって」, 「なっており、」, 「見直すとともに」, 「強化、」, 「狙う。」). Here too there is no similarity between bunsetsu sequences that could serve as a clue to clause coordination, and a combinatorial explosion of ambiguity is expected in the dependency relations among the predicates.

[0007] The present invention has been made in view of the above. Its object is to provide a Japanese analysis method that can efficiently recognize the dependency relations between bunsetsu in long Japanese sentences with many bunsetsu and minimize the occurrence of ambiguity in dependency attachment.

[0008]

[Means for Solving the Problems] To achieve the above object, the Japanese analysis method of the present invention recognizes the words contained in an input sentence expressed in Japanese and parses the input sentence by recognizing the dependency relations between bunsetsu, taking as the basic unit the bunsetsu, a group consisting of one or more words. The gist of the method is that it extracts groups of one or more bunsetsu constituting predicate phrases, classifies the predicate phrases stepwise according to their strength of independence so as to fit the expressive structure of Japanese, analyzes the structure of the input sentence based on the strength of independence of the predicate phrases, and further recognizes the dependency relations between the individual bunsetsu in the structure-analyzed input sentence, thereby parsing the Japanese input sentence.

[0009]

[Operation] In the Japanese analysis method of the present invention, groups of one or more bunsetsu constituting predicate phrases are extracted from the input sentence; the predicate phrases are classified stepwise according to their strength of independence so as to fit the expressive structure of Japanese; the structure of the input sentence is analyzed based on the strength of independence of the predicate phrases; and the dependency relations between the individual bunsetsu are further recognized in the structure-analyzed input sentence, whereby the Japanese input sentence is parsed.

[0010]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0011] FIG. 1 is a block diagram showing the configuration of a Japanese analysis system that carries out a Japanese analysis method according to one embodiment of the present invention. In the figure, 1 denotes an input unit for entering a Japanese sentence; 2, a morphological analysis unit that segments the input sentence into words using a morpheme dictionary 5; 3, a dependency analysis unit that analyzes the dependency relations between bunsetsu from the morphological analysis result obtained by the morphological analysis unit, using a dependency dictionary 6; and 4, an output unit that outputs the dependency analysis result.

[0012] The operation of the Japanese analysis system of this embodiment is outlined as follows.

[0013] A Japanese sentence entered through the input unit 1 is segmented into words by the morphological analysis unit 2. The dependency analysis unit 3 then analyzes the dependencies between bunsetsu, based on the output of the morphological analysis unit 2 and using the dependency dictionary 6. The result is output from the output unit 4.

[0014] FIG. 2 is a block diagram showing the detailed configuration of the dependency analysis unit 3 and the dependency dictionary 6 used in the system of FIG. 1. As shown in FIG. 2, the dependency analysis unit 3 consists of a predicate phrase recognition unit 31, a predicate phrase classification unit 32, a predicate phrase relation recognition unit 33, and a dependency relation recognition unit 34, and the dependency dictionary 6 consists of a predicate phrase classification table 61 and a dependency recognition table 62.

[0015] The predicate phrase classification table 61 of the dependency dictionary 6 is configured as shown in FIG. 3. For various word configurations of bunsetsu, the table gives rule numbers 11, 12, ... and predicate phrase classes S₀, S₁, .... In the word configurations of the table shown in FIG. 3, W₁ denotes an arbitrary word (with a part-of-speech condition specified) and * denotes zero or more arbitrary words. Strictly speaking, only the part that constitutes the predicate phrase would need to be described. However, for the form "sahen noun + comma" (サ変名詞＋読点), the table also specifies whether the immediately preceding bunsetsu is a case element, as a condition for judging the form to be a predicate phrase, in order to distinguish it from coordination such as 「研究、開発」 ("research, development").
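The pattern notation described for the predicate phrase classification table (one word constrained by part of speech, and * for zero or more words) can be illustrated with a small matcher. This is a sketch under stated assumptions: the tuple encoding, the function names `matches` and `classify`, the part-of-speech labels, and the two patterns in `TABLE` are all invented for illustration. The actual contents of FIG. 3 are not reproduced in the text; only the rule numbers 51 and 11 and the classes S₄ and S₀ are taken from the description.

```python
# Hypothetical encoding of a "word configuration" pattern:
#   ("pos", p)  one word whose part of speech is p   (plays the role of W1)
#   ("lit", s)  one word whose surface form is s
#   ("star",)   zero or more arbitrary words          (plays the role of *)

def matches(pattern, words):
    """Return True if the whole word list matches the whole pattern."""
    if not pattern:
        return not words
    head, rest = pattern[0], pattern[1:]
    if head[0] == "star":
        # Let * absorb 0, 1, 2, ... leading words and try the remainder.
        return any(matches(rest, words[k:]) for k in range(len(words) + 1))
    if not words:
        return False
    surface, pos = words[0]
    ok = (surface == head[1]) if head[0] == "lit" else (pos == head[1])
    return ok and matches(rest, words[1:])


# Illustrative table only: the patterns are assumptions, not FIG. 3.
TABLE = [
    (51, [("pos", "sahen-noun"), ("star",), ("lit", "ため")], "S4"),
    (11, [("star",), ("lit", "いる"), ("lit", "。")], "S0"),
]


def classify(words):
    """Return (rule number, class) of the first matching rule, else None."""
    for rule, pattern, cls in TABLE:
        if matches(pattern, words):
            return rule, cls
    return None


# Words of the phrase 「鈍化している／ため」 as (surface, part of speech).
phrase = [("鈍化", "sahen-noun"), ("し", "verb"), ("て", "particle"),
          ("いる", "verb"), ("ため", "noun")]
print(classify(phrase))
```

Trying every rule in table order mirrors the sequential comparison against all word configurations that the description attributes to the predicate phrase recognition unit.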

[0016] The dependency recognition table 62 of the dependency dictionary 6 is configured as shown in FIG. 4. For each rule number, the table gives the dependent bunsetsu and the head bunsetsu as a pair, together with the attribute of the dependency. Braces { } indicate that any one of the words or classes listed inside them is used. W₁ denotes an arbitrary word (with a part-of-speech condition specified), and * denotes zero or more arbitrary words (possibly with a part-of-speech condition). In addition, a predicate phrase attaches to the nearest predicate phrase that satisfies the "head bunsetsu" condition.

[0017] Next, the operation of the embodiment shown in FIG. 1 is described with reference to the flowchart of FIG. 5 and the analysis results shown in FIGS. 6 and 7.

[0018] First, consider the case where the input sentence is 「出版取次はもともと利益率が低いことに加えて、出版物の需要が鈍化しているため苦しい経営を余儀なくされている」. This input sentence is supplied to the morphological analysis unit 2 through the input unit 1 shown in FIG. 1. The morphological analysis unit 2 segments it into words with reference to the morpheme dictionary 5, yielding a morphological analysis result consisting of bunsetsu 1 to 14, as shown in FIG. 6 (step 110 in FIG. 5).

[0019] This morphological analysis result is supplied to the dependency analysis unit 3, where the predicate phrase recognition unit 31 refers to the predicate phrase classification table 61 and detects the groups that form predicate phrases (step 120). In this example, the morphological analysis result for bunsetsu 4 to 7 matches the word configuration of rule 71 of the predicate phrase classification table 61, that for bunsetsu 9 to 10 matches the word configuration of rule 51, and that for bunsetsu 14 matches the word configuration of rule 11, so this result is supplied to the predicate phrase classification unit 32. The predicate phrase classification unit 32 refers to the predicate phrase classification table 61 to determine the class of each predicate phrase and assigns the classes S₅, S₄, and S₀, as shown in FIG. 6 (step 130). The predicate phrase recognition unit 31 detects the predicate phrase groups by comparing the morphological analysis result shown in FIG. 6 in turn against every word configuration in the predicate phrase classification table 61 of FIG. 3 and judging whether the two match.

[0020] The result of classifying the predicate phrases as described above is supplied to the predicate phrase relation recognition unit 33, which refers to the dependency recognition table 62 and determines the dependencies between the predicate phrases (step 140). As shown under "dependency" in FIG. 6, rule 104 of the dependency recognition table 62 of FIG. 4 determines that the head of the predicate phrase consisting of bunsetsu 4 to 6, which has class S₅, is the predicate phrase consisting of bunsetsu 9 to 10, and rule 103 determines that the head of the predicate phrase consisting of bunsetsu 9 and 10, which has class S₄, is bunsetsu 14. These dependencies are determined according to the strength of independence of each predicate phrase. The strength of independence is in turn determined by the relation between predicate phrases, such as coordination, simultaneity, or causality. For example, independence can be said to be high when the predicate phrases are entirely unrelated in meaning, and low when they stand in a causal relation such as cause and reason. Concretely, for each predicate phrase class determined as above, for example S₅, the dependent-bunsetsu entries of the dependency recognition table 62 are searched to find that the matching S₅ is in rule 104, and the predicate phrase classes S₀, S₁, ..., S₅ listed for the head bunsetsu of that entry are then detected.
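The attachment of predicate phrases in step 140 can be sketched as a nearest-match search. The sketch below rests on assumptions: `attach_predicate_phrases` is a hypothetical name, and the `ALLOWED` mapping stands in for the dependency recognition table of FIG. 4, which the text does not reproduce. The head classes S₀ to S₅ for a dependent of class S₅ follow the description of rule 104; the head classes listed for S₄ are invented for illustration.

```python
def attach_predicate_phrases(phrases, allowed_heads):
    """Attach each predicate phrase to the nearest later phrase whose class
    is permitted as a head for the dependent's class.

    `phrases` is a list of ((first, last), cls) in sentence order.
    Returns {dependent index: head index}; a phrase with no entry is the root.
    """
    heads = {}
    for i, (_, cls) in enumerate(phrases):
        for j in range(i + 1, len(phrases)):
            if phrases[j][1] in allowed_heads.get(cls, ()):
                heads[i] = j
                break
    return heads


# Predicate phrases of the first example sentence (bunsetsu spans, class).
PHRASES = [((4, 6), "S5"), ((9, 10), "S4"), ((14, 14), "S0")]

# Assumed stand-in for the table of FIG. 4.
ALLOWED = {
    "S5": {"S0", "S1", "S2", "S3", "S4", "S5"},  # per the description of rule 104
    "S4": {"S0", "S1", "S2", "S3"},              # assumption for rule 103
}

result = attach_predicate_phrases(PHRASES, ALLOWED)
for dep, head in result.items():
    print(PHRASES[dep][0], "->", PHRASES[head][0])
```

Under these assumptions the sketch reproduces the attachments stated in the description: bunsetsu 4 to 6 attach to bunsetsu 9 to 10, which attach to bunsetsu 14.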

[0021] Finally, the dependency relation recognition unit 34 refers to the dependency recognition table 62 and determines the dependencies of the remaining elements (step 150). For example, rule 201 sets the head of bunsetsu 1 to bunsetsu 9 to 10, and rule 301 sets the head of bunsetsu 2 to bunsetsu 4. Proceeding in this way yields the final dependency analysis result of FIG. 6.

[0022] Next, consider the case where the input sentence is 「水道事業でのビリングマシン導入の時代から自治体への納入実績があるが、その後オフコンになってライバル他社に押されぎみになっており、ソフトウェアを体系的に見直すとともにプロジェクトチームを強化、自治体市場での失地回復を狙う」. As in the previous case, the morphological analysis unit 2 segments this input sentence into words with reference to the morpheme dictionary 5, yielding a morphological analysis result consisting of bunsetsu 1 to 20, as shown in FIG. 7.

[0023] From this morphological analysis result, the predicate phrase recognition unit 31 of the dependency analysis unit 3 refers to the predicate phrase classification table 61 and detects the six predicate phrase groups shown in FIG. 7. As shown in FIG. 7, these six predicate phrases are classified in order: bunsetsu 6 as S₁ (rule 21), bunsetsu 9 as S₄ (rule 52), bunsetsu 12 as S₃ (rule 41), bunsetsu 15 as S₆ (rule 81), bunsetsu 17 as S₃ (rule 42), and bunsetsu 20 as S₀ (rule 12). The dependencies between the predicate phrases are then determined as in FIG. 7, after which the dependencies of the remaining elements are determined, giving the final dependency analysis result of FIG. 7.

[0024] For simplicity, the above description assumed that no ambiguity in dependency attachment is output. The case where ambiguity is output is supplemented briefly here. For example, bunsetsu 1 in FIG. 6 has the form "noun + は", so under the general rules it is taken to depend on a verb or an adjective, and the four predicates in bunsetsu 4, 6, 9, and 14 are obtained as ambiguous candidate heads. If the predicate phrases are recognized and classified in advance, as in the present invention, the candidates are limited to two: bunsetsu 9 to 10 and bunsetsu 14. As for the dependencies of the predicate phrases themselves, under the general rules the five patterns shown in FIG. 8 must be considered as the dependencies among the four predicates.

[0025] According to the present invention, there are only three predicate phrases, so the patterns to be considered are reduced to the two shown in FIG. 9. Since the rule constraints permit only one of these, a substantial overall reduction in analysis ambiguity is considered possible.

[0026] Similarly, with six predicates as in FIG. 7, the general rules require 42 patterns to be considered, whereas in the present invention the rule constraints limit this to a single pattern. Moreover, once the dependencies of the predicate phrases are determined, structural constraints greatly narrow the possible attachments of the elements other than predicates, so finding the correct dependencies is considered to become easier.
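The pattern counts quoted above (2 for three predicates, 5 for four, 42 for six) can be reproduced by brute-force enumeration under the general rules. The sketch below is an illustration, not part of the patent: it assumes every predicate except the last takes exactly one later predicate as its head, the last predicate is the root, and arcs must not cross. The names `crosses` and `count_patterns` are hypothetical.

```python
from itertools import product


def crosses(heads):
    """Return True if any two arcs (i -> heads[i]) cross."""
    arcs = list(enumerate(heads))
    for a, b in arcs:
        for c, d in arcs:
            if a < c < b < d:
                return True
    return False


def count_patterns(n):
    """Count dependency patterns among n predicates under the general rules.

    Predicate i (0-based, i < n - 1) chooses one head among i + 1 .. n - 1;
    the last predicate is the root. Patterns with crossing arcs are rejected.
    """
    choices = [range(i + 1, n) for i in range(n - 1)]
    return sum(1 for heads in product(*choices) if not crosses(heads))


for n in (3, 4, 6):
    print(n, count_patterns(n))
```

These counts coincide with the Catalan numbers, which is why the number of candidate patterns grows so quickly with the number of predicates and why reducing the number of units from four predicates to three predicate phrases already cuts the candidates from five to two.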

[0027]

[Effects of the Invention] As described above, according to the present invention, groups of one or more bunsetsu constituting predicate phrases are extracted from the input sentence; the predicate phrases are classified stepwise according to their strength of independence so as to fit the expressive structure of Japanese; the structure of the input sentence is analyzed based on the strength of independence of the predicate phrases; and the dependency relations between the individual bunsetsu are further recognized in the structure-analyzed input sentence, whereby the Japanese input sentence is parsed. Consequently, the dependency structure can be recognized efficiently even for long Japanese sentences with many bunsetsu, and the occurrence of ambiguity in dependency attachment can be minimized.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a Japanese analysis system that carries out a Japanese analysis method according to one embodiment of the present invention.

FIG. 2 is a block diagram showing the detailed configuration of the dependency analysis unit and the dependency dictionary used in the Japanese analysis system shown in FIG. 1.

FIG. 3 is a diagram showing the configuration of the predicate phrase classification table shown in FIG. 2.

FIG. 4 is a diagram showing the configuration of the dependency recognition table shown in FIG. 2.

FIG. 5 is a flowchart showing the operation of the embodiment shown in FIG. 1.

FIG. 6 is a diagram showing the analysis result for an example Japanese input sentence.

FIG. 7 is a diagram showing the analysis result for an example Japanese input sentence.

FIG. 8 is an explanatory diagram showing patterns of dependency among predicates.

FIG. 9 is an explanatory diagram showing patterns of dependency among predicates.

[Explanation of symbols]

1: input unit
2: morphological analysis unit
3: dependency analysis unit
4: output unit
5: morpheme dictionary
6: dependency dictionary
31: predicate phrase recognition unit
32: predicate phrase classification unit
33: predicate phrase relation recognition unit
34: dependency relation recognition unit
61: predicate phrase classification table
62: dependency recognition table

Claims

[Claims]

1. A Japanese analysis method that recognizes the words contained in an input sentence expressed in Japanese and parses the input sentence by recognizing the dependency relations between bunsetsu, taking as the basic unit the bunsetsu, a group consisting of one or more words, the method being characterized by: extracting groups of one or more bunsetsu that constitute predicate phrases; classifying the predicate phrases stepwise according to their strength of independence so as to fit the expressive structure of Japanese; analyzing the structure of the input sentence based on the strength of independence of the predicate phrases; and further recognizing the dependency relations between the individual bunsetsu in the structure-analyzed input sentence, thereby parsing the Japanese input sentence.