JPH09212507A

JPH09212507A - Character processor and analytic method for character string

Info

Publication number: JPH09212507A
Application number: JP8044089A
Authority: JP
Inventors: Michio Aizawa; 道雄相澤; Tsuyoshi Yagisawa; 津義八木沢; Minoru Fujita; 稔藤田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1996-02-07
Filing date: 1996-02-07
Publication date: 1997-08-15

Abstract

PROBLEM TO BE SOLVED: To reduce the time needed to analyze dependency (modification) relations between clauses by applying computationally expensive analysis methods only to input character strings for which those methods are effective. SOLUTION: The input character string is divided into words by a morphological analysis unit 2 and the words are held in a holding unit 3, after which a clause generation unit 4 generates clauses from the words and stores them in a holding unit 5. Next, a clause type determination unit 6 classifies the clauses held in the holding unit 5 by type. Based on the classified clause types, an analysis method determination unit 7 specifies the analysis method: judging from the number of declinable words, the number of substantives modifying declinable words, and the number of clauses, it decides whether to execute only dependency analysis by a dependency analysis unit 8, or to also execute case relation analysis by a case relation analysis unit 9 and parallel structure analysis by a parallel structure analysis unit 10.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a character processing device and a character string analysis method, and more particularly to a character processing device and a character string analysis method that analyze the dependency relations between Japanese clauses (bunsetsu).

[0002]

[Description of the Related Art] In character processing devices such as Japanese word processors, dependency analysis, which analyzes the dependency relations between Japanese clauses, has conventionally been used by application software such as speech synthesis and machine translation. For example, speech synthesis uses the result of dependency analysis to determine pause positions, and machine translation uses it to determine the subject and the object.

[0003]

The basis of this dependency analysis is a simple rule: "attach a continuous-modification-type clause to the nearest declinable word (yogen), and attach an adnominal-modification-type clause to the nearest substantive (taigen)." In conventional character processing devices, analysis methods such as case relation analysis and parallel structure analysis are generally combined with this rule to raise the accuracy of the dependency analysis.
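
The baseline rule quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Clause` type, the `baseline_heads` name, and the string labels are hypothetical, and "nearest" is assumed to mean the nearest clause to the right, since the text does not state the search direction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clause:
    text: str
    kind: str      # "yogen" (declinable word), "taigen" (substantive), or "other"
    modifies: str  # "continuous" or "adnominal"


def baseline_heads(clauses: List[Clause]) -> List[Optional[int]]:
    """Return, for each clause, the index of the clause it attaches to (or None)."""
    heads: List[Optional[int]] = []
    for i, clause in enumerate(clauses):
        # Continuous-modification clauses look for a declinable word,
        # adnominal-modification clauses look for a substantive.
        wanted = "yogen" if clause.modifies == "continuous" else "taigen"
        # Assumption: only clauses to the right are candidate heads.
        head = next(
            (j for j in range(i + 1, len(clauses)) if clauses[j].kind == wanted),
            None,
        )
        heads.append(head)
    return heads


# Hypothetical example: 赤い / 花が / 咲いた ("a red flower bloomed")
sentence = [
    Clause("赤い", "yogen", "adnominal"),
    Clause("花が", "taigen", "continuous"),
    Clause("咲いた", "yogen", "continuous"),
]
print(baseline_heads(sentence))
```

For this example the sketch prints `[1, 2, None]`: the first clause attaches to the second, the second to the third, and the final clause has no head.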

[0004]

[Problems to Be Solved by the Invention] However, case relation analysis and parallel structure analysis require more computation than the basic dependency analysis. Moreover, a character processing device that combines case relation analysis and parallel structure analysis with dependency analysis between clauses always applies these methods, even though they are effective for some sentences and not for others, so the amount of computation becomes larger than necessary. The amount of computation has become a serious problem particularly when speech synthesis or machine translation software is applied in real time to the large volume of documents obtained from information networks such as the Internet.

[0005]

The present invention has been made in view of these circumstances. Its object is to provide a character processing device and a character string analysis method that reduce the time required for dependency analysis between clauses by applying a computationally expensive analysis method only to input character strings for which that method is effective.

[0006]

[Means for Solving the Problems] To achieve the above object, a character processing device according to the present invention comprises morphological analysis means for morphologically analyzing an input character string and dividing it into words, clause generation means for generating clauses from the words analyzed by the morphological analysis means, and a plurality of mutually different character string analysis means for analyzing the input character string. The device is characterized by having analysis method specifying means for specifying at least one character string analysis means from among the plurality of character string analysis means according to the content of the clauses generated by the clause generation means.

[0007]

Preferably, the character processing device further comprises partitioning means for partitioning a large number of clauses into a plurality of mutually different clause types, and clause type determination means for determining to which of the plurality of clause types a clause generated by the clause generation means belongs, and the analysis method specifying means operates based on the clause type determined by the clause type determination means.

[0008]

Further, the plurality of character string analysis means include at least dependency analysis means for analyzing dependency relations between clauses, case relation analysis means for analyzing case relations between clauses, and parallel structure analysis means for analyzing and extracting the parallel structure of the entire character string, and the analysis method specifying means specifies the analysis method from among these character string analysis means.

[0009]

The dependency analysis means is always specified by the analysis method specifying means. The case relation analysis means is specified by the analysis method specifying means when the number of declinable words (yogen) in the character string is equal to or larger than a first predetermined number and the number of continuous-modification-type substantives (taigen) is equal to or larger than a second predetermined number. The parallel structure analysis means is specified by the analysis method specifying means when the number of clauses in the character string is equal to or larger than a third predetermined number.

[0010]

Further, the case relation analysis means executes case relation analysis based on a command from the dependency analysis means, and the parallel structure analysis means executes parallel structure analysis based on a command from the dependency analysis means.

[0011]

A character string analysis method according to the present invention includes a morphological analysis step of morphologically analyzing an input character string and dividing it into words, a clause generation step of generating clauses from the words analyzed in the morphological analysis step, and a plurality of mutually different character string analysis steps of analyzing the input character string. The method is characterized by having an analysis method specifying step of specifying at least one character string analysis step from among the plurality of character string analysis steps according to the content of the clauses generated in the clause generation step.

[0012]

Preferably, the character string analysis method further includes a clause type determination step of determining to which of a plurality of clause types a clause generated in the clause generation step belongs, and the analysis method specifying step is executed based on the clause type determined in the clause type determination step.

[0013]

Further, the plurality of character string analysis steps include at least a dependency analysis step of analyzing dependency relations between clauses, a case relation analysis step of analyzing case relations between clauses, and a parallel structure analysis step of analyzing and extracting the parallel structure of the entire character string, and the analysis method specifying step specifies the analysis method from among these character string analysis steps.

[0014]

The analysis method specifying step always specifies the dependency analysis step. It specifies the case relation analysis step when the number of declinable words (yogen) in the character string is equal to or larger than a first predetermined number and the number of continuous-modification-type substantives (taigen) is equal to or larger than a second predetermined number. It specifies the parallel structure analysis step when the number of clauses in the character string is equal to or larger than a third predetermined number.

[0015]

Further, the case relation analysis step executes case relation analysis based on a command from the dependency analysis step, and the parallel structure analysis step executes parallel structure analysis based on a command from the dependency analysis step.

[0016]

[Embodiments of the Invention] Embodiments of the present invention are described in detail below with reference to the drawings.

[0017]

FIG. 1 is a block diagram showing an embodiment of a character processing device according to the present invention.

[0018]

In FIG. 1, reference numeral 1 denotes a character input unit, such as a keyboard, through which the character string to be analyzed is input. The morphological analysis unit 2 morphologically analyzes the character string input through the character input unit 1 and divides it into words. The morphological analysis result holding unit 3 holds the result of the analysis by the morphological analysis unit 2. The clause generation unit 4 generates clauses based on the contents held in the morphological analysis result holding unit 3. The clause holding unit 5 holds the clauses generated by the clause generation unit 4. The clause type determination unit 6 determines the clause type of each clause held in the clause holding unit 5, sorts the clauses by type, and writes the result to the clause holding unit 5. The analysis method determination unit 7 determines the analysis method for the clauses held in the clause holding unit 5 based on the clause types obtained by the clause type determination unit 6. The dependency analysis unit 8 analyzes the dependencies of the clauses held in the clause holding unit 5 according to the analysis method determined by the analysis method determination unit 7. The case relation analysis unit 9 analyzes the case relations between clauses in response to a signal from the dependency analysis unit 8. The parallel structure analysis unit 10 analyzes and extracts the parallel structure of the whole sentence in response to a signal from the dependency analysis unit 8. The dependency analysis result holding unit 11 holds the result of the analysis by the dependency analysis unit 8. The dependency analysis result output unit 12 outputs the contents held in the dependency analysis result holding unit 11 to an output unit (not shown) such as a display or a printer. Reference numeral 13 denotes an analysis dictionary, which the morphological analysis unit 2, the clause type determination unit 6, the dependency analysis unit 8, the case relation analysis unit 9, and the parallel structure analysis unit 10 each consult when performing their analyses.

[0019]

FIG. 2 is a flowchart showing the processing procedure of the character string analysis method in the above character processing device.

[0020]

In step S1, the morphological analysis unit 2 performs morphological analysis of the input character string based on predetermined parts of speech, divides the character string into words, and stores the resulting words in the morphological analysis result holding unit 3. In this embodiment, nine parts of speech are used for the morphological analysis, for example: noun, verb, adjective, adjectival verb, adverb, adnominal, conjunction, particle, and auxiliary verb.

[0021]

In step S2, the clause generation unit 4 generates clauses from the words held in the morphological analysis result holding unit 3 and stores them in the clause holding unit 5. Here, a clause consists of one independent word, with attached words appended as needed. In this embodiment, seven parts of speech are used as independent words (noun, verb, adjective, adjectival verb, adverb, adnominal, and conjunction), and two are used as attached words (particle and auxiliary verb).
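
As a rough illustration of step S2, the grouping of words into clauses might look like the following Python sketch. The function name, the part-of-speech labels, and the handling of a leading attached word are assumptions made for the example; the text specifies only that each clause has one independent word followed by optional attached words.

```python
from typing import List, Tuple

Word = Tuple[str, str]  # (surface form, part of speech)

# The seven independent and two attached parts of speech named in the embodiment.
INDEPENDENT = {"noun", "verb", "adjective", "adjectival_verb",
               "adverb", "adnominal", "conjunction"}
ATTACHED = {"particle", "auxiliary_verb"}


def generate_clauses(words: List[Word]) -> List[List[Word]]:
    """Group words so that each clause starts with one independent word."""
    clauses: List[List[Word]] = []
    for word in words:
        _, pos = word
        if pos not in INDEPENDENT and pos not in ATTACHED:
            raise ValueError(f"unknown part of speech: {pos}")
        if pos in INDEPENDENT or not clauses:
            # An independent word opens a new clause. Assumption: an attached
            # word with nothing before it also opens one, so no input is lost.
            clauses.append([word])
        else:
            # Attached words join the clause opened by the last independent word.
            clauses[-1].append(word)
    return clauses


# Hypothetical morphological analysis result for 花が咲いた.
words = [("花", "noun"), ("が", "particle"),
         ("咲い", "verb"), ("た", "auxiliary_verb")]
clauses = generate_clauses(words)
print(["".join(surface for surface, _ in clause) for clause in clauses])
```

With this input the sketch yields two clauses, 花が and 咲いた.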

[0022]

In step S3, the clause type determination unit 6 classifies the clauses held in the clause holding unit 5 by type and determines to which clause type each clause belongs.

[0023]

Specifically, as shown in FIG. 3, clauses are classified into "declinable word" (yogen), "substantive" (taigen), and "other", and each of these is further classified as "continuous-modification type" or "adnominal-modification type". Here, a "declinable word" clause is one whose independent word is a verb, adjective, or adjectival verb, and a "substantive" clause is one whose independent word is a noun. An "other" clause is one whose independent word is an adverb, adnominal, or conjunction. Whether a clause is of the continuous-modification type or the adnominal-modification type is taken to be the same as the modification type of the last word in the clause. For each word (for each conjugated form, in the case of a conjugating word), whether its modification type is continuous or adnominal is registered in the analysis dictionary 13.

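The classification of FIG. 3 can be illustrated with a small Python sketch. The `MODIFICATION_TYPE` table stands in for the entries of analysis dictionary 13 and its contents are invented for the example; the sketch also assumes that the independent word is the first word of the clause.

```python
from typing import Dict, List, Tuple

Word = Tuple[str, str]  # (surface form, part of speech)

# Clause kind follows the part of speech of the independent word.
KIND_BY_POS: Dict[str, str] = {
    "verb": "yogen", "adjective": "yogen", "adjectival_verb": "yogen",
    "noun": "taigen",
    "adverb": "other", "adnominal": "other", "conjunction": "other",
}

# Hypothetical dictionary entries: modification type per word (or conjugated form).
MODIFICATION_TYPE: Dict[str, str] = {
    "が": "continuous",
    "を": "continuous",
    "の": "adnominal",
    "赤い": "adnominal",
    "赤く": "continuous",
}


def classify_clause(clause: List[Word]) -> Tuple[str, str]:
    """Return (kind, modification type) for one clause."""
    _, head_pos = clause[0]       # assumption: the independent word comes first
    last_surface, _ = clause[-1]  # the last word decides the modification type
    return KIND_BY_POS[head_pos], MODIFICATION_TYPE[last_surface]


print(classify_clause([("花", "noun"), ("が", "particle")]))
print(classify_clause([("花", "noun"), ("の", "particle")]))
print(classify_clause([("赤い", "adjective")]))
```

Under these illustrative dictionary entries, 花が is a continuous-modification substantive, 花の is an adnominal-modification substantive, and 赤い is an adnominal-modification declinable word.
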
[0024]

In step S4, the analysis method determination unit 7 determines the analysis method based on the clause types determined by the clause type determination unit 6.

[0025]

Specifically, as shown in FIG. 4, the analysis method determination unit 7 determines, according to the input character string, whether to perform dependency analysis alone or to perform case relation analysis and parallel structure analysis as well. The criteria identify the character strings for which case relation analysis and parallel structure analysis are effective. In this embodiment, dependency analysis is always performed on all clauses; case relation analysis is selected when there are two or more declinable words and one or more continuous-modification-type substantives; and parallel structure analysis is selected when there are ten or more clauses.
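
The criteria of FIG. 4 amount to three threshold checks, sketched below in Python. The function and label names are hypothetical; the default thresholds (2, 1, and 10) are the values given for this embodiment.

```python
from typing import List, Set, Tuple

ClauseType = Tuple[str, str]  # (kind, modification type)


def select_methods(
    clause_types: List[ClauseType],
    min_yogen: int = 2,               # first predetermined number
    min_continuous_taigen: int = 1,   # second predetermined number
    min_clauses: int = 10,            # third predetermined number
) -> Set[str]:
    """Decide which analyses to run for one input character string."""
    methods = {"dependency"}  # dependency analysis is always selected
    n_yogen = sum(1 for kind, _ in clause_types if kind == "yogen")
    n_continuous_taigen = sum(
        1 for kind, mod in clause_types
        if kind == "taigen" and mod == "continuous"
    )
    if n_yogen >= min_yogen and n_continuous_taigen >= min_continuous_taigen:
        methods.add("case_relation")
    if len(clause_types) >= min_clauses:
        methods.add("parallel_structure")
    return methods


simple = [("taigen", "continuous"), ("yogen", "continuous")]
two_predicates = [("yogen", "adnominal"), ("taigen", "continuous"),
                  ("yogen", "continuous")]
print(sorted(select_methods(simple)))
print(sorted(select_methods(two_predicates)))
```

A two-clause string with one declinable word gets only dependency analysis, while a string with two declinable words and one continuous-modification substantive also gets case relation analysis.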

[0026]

In step S5, the dependency analysis unit 8 analyzes the character string according to the analysis method determined by the analysis method determination unit 7, and stores the result in the dependency analysis result holding unit 11.
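
Step S5 can be pictured as the dependency analysis calling the optional analyses only when they were selected. The Python sketch below uses stub analyzers to show that the expensive analyses are skipped otherwise. All names are hypothetical, and the order of the calls and the way their results would be merged into the final dependency result are assumptions, since the text does not describe them.

```python
from typing import Callable, Dict, List, Set

Clauses = List[str]
Analyzer = Callable[[Clauses], object]


def run_analysis(
    clauses: Clauses,
    methods: Set[str],
    dependency: Analyzer,
    case_relation: Analyzer,
    parallel_structure: Analyzer,
) -> Dict[str, object]:
    """Run dependency analysis, invoking the costly analyses only if selected."""
    results: Dict[str, object] = {"dependency": dependency(clauses)}
    if "case_relation" in methods:
        results["case_relation"] = case_relation(clauses)
    if "parallel_structure" in methods:
        results["parallel_structure"] = parallel_structure(clauses)
    return results


calls: List[str] = []  # records which analyzers were actually invoked


def make_stub(name: str) -> Analyzer:
    def stub(clauses: Clauses) -> object:
        calls.append(name)
        return f"{name} result for {len(clauses)} clauses"
    return stub


out = run_analysis(
    ["花が", "咲いた"],
    {"dependency"},
    make_stub("dependency"),
    make_stub("case_relation"),
    make_stub("parallel_structure"),
)
print(calls)
```

Here only the dependency stub is called, which is the saving the embodiment aims for on short or simple strings.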

[0027]

In step S6, the dependency analysis result output unit 12 outputs the dependency analysis result to application software such as speech synthesis or machine translation, and the process ends.

[0028]

Thus, this character processing device always executes dependency analysis, but executes the computationally expensive case relation analysis and parallel structure analysis only on character strings for which, judging from their content, those analyses are effective. This reduces the amount of computation and hence the time required for dependency analysis between clauses. As a result, the problems that increased computation causes in speech synthesis and machine translation can be avoided as far as possible, even for the large volume of documents obtained from information networks such as the Internet.

[0029]

The present invention is not limited to the above embodiment. Although the embodiment uses three analysis methods (dependency analysis, case relation analysis, and parallel structure analysis), other analysis methods, such as adverb concord analysis, may preferably be added as appropriate.

[0030]

In the above embodiment, case relation analysis and parallel structure analysis are executed only when the predetermined conditions are satisfied, but the user may instead be allowed to designate the analysis methods to be used. For example, it is also preferable to let the user specify that case relation analysis is always executed regardless of the determination made by the analysis method determination unit 7.
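
One way to picture this user designation is as a set of methods that is merged into whatever the determination unit selected. The Python sketch below is only an assumption about how such an option could be modeled; the names are hypothetical and the text does not describe an interface.

```python
from typing import Iterable, Set

KNOWN_METHODS = {"dependency", "case_relation", "parallel_structure"}


def apply_user_choice(judged: Set[str], always_run: Iterable[str] = ()) -> Set[str]:
    """Add user-forced analysis methods to the automatically judged set."""
    forced = set(always_run)
    unknown = forced - KNOWN_METHODS
    if unknown:
        raise ValueError(f"unknown analysis methods: {sorted(unknown)}")
    # Forced methods run regardless of the automatic determination.
    return set(judged) | forced


print(sorted(apply_user_choice({"dependency"}, always_run=["case_relation"])))
```

With `always_run=["case_relation"]`, case relation analysis is included even for a string that the automatic criteria would have handled with dependency analysis alone.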

[0031]

[Effects of the Invention] As described in detail above, the character processing device and the character string analysis method according to the present invention specify, as needed, at least one character string analysis means or step from among a plurality of character string analysis means or steps. Computationally expensive analysis methods such as case relation analysis and parallel structure analysis can therefore be executed only for particular input character strings, which reduces the amount of computation and hence the time required for dependency analysis between clauses. As a result, the problems that increased computation causes in speech synthesis and machine translation can be avoided as far as possible, even for the large volume of documents obtained from information networks such as the Internet.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of a character processing device according to the present invention.

FIG. 2 is a flowchart showing the processing procedure of a character string analysis method according to the present invention.

FIG. 3 is a diagram showing the types of clauses.

FIG. 4 is a diagram showing the criteria for determining the analysis method.

[Explanation of symbols]

2: Morphological analysis unit (morphological analysis means)
4: Clause generation unit (clause generation means)
6: Clause type determination unit (clause type determination means)
7: Analysis method determination unit (analysis method specifying means)
8: Dependency analysis unit (dependency analysis means)
9: Case relation analysis unit (case relation analysis means)
10: Parallel structure analysis unit (parallel structure analysis means)

Claims

[Claims]

1. A character processing device comprising morphological analysis means for morphologically analyzing an input character string and dividing it into words, clause generation means for generating clauses from the words analyzed by the morphological analysis means, and a plurality of mutually different character string analysis means for analyzing the input character string, the device further comprising analysis method specifying means for specifying at least one character string analysis means from among the plurality of character string analysis means according to the content of the clauses generated by the clause generation means.

2. The character processing device according to claim 1, further comprising partitioning means for partitioning a large number of clauses into a plurality of mutually different clause types, and clause type determination means for determining to which of the plurality of clause types a clause generated by the clause generation means belongs, wherein the analysis method specifying means operates based on the clause type determined by the clause type determination means.

3. The character processing device according to claim 1 or 2, wherein the plurality of character string analysis means include at least dependency analysis means for analyzing dependency relations between clauses, case relation analysis means for analyzing case relations between clauses, and parallel structure analysis means for analyzing and extracting the parallel structure of the entire character string, and the analysis method specifying means specifies an analysis method from among these character string analysis means.

4. The character processing device according to claim 3, wherein the dependency analysis unit is always identified by the analysis method identifying unit.

5. The case relation analysis means analyzes the case when the number of phrases in the character string is equal to or larger than a first predetermined number and the number of continuous modified type phrases is equal to or larger than a second predetermined number. 4. The method is specified by the method specifying means.
Alternatively, the character processing device according to claim 4.

6. The parallel structure analysis means is specified by the analysis method specifying means when the number of clauses in the character string is equal to or larger than a third predetermined number.
The character processing device according to claim 5.

7. The character processing device according to claim 3, wherein the case relation analysis means executes case relation analysis based on a command from the dependency analysis means.

8. The character processing device according to claim 3, wherein the parallel structure analysis means executes parallel structure analysis based on a command from the dependency analysis means.

9. A character string analysis method including a morphological analysis step of morphologically analyzing an input character string and dividing it into words, a clause generation step of generating clauses from the words analyzed in the morphological analysis step, and a plurality of mutually different character string analysis steps of analyzing the input character string, the method comprising an analysis method specifying step of specifying at least one character string analysis step from among the plurality of character string analysis steps according to the content of the clauses generated in the clause generation step.

10. The character string analysis method according to claim 9, further comprising a clause type determination step of determining to which of a plurality of clause types a clause generated in the clause generation step belongs, wherein the analysis method specifying step is executed based on the clause type determined in the clause type determination step.

11. The character string analysis method according to claim 9 or 10, wherein the plurality of character string analysis steps include at least a dependency analysis step of analyzing dependency relations between clauses, a case relation analysis step of analyzing case relations between clauses, and a parallel structure analysis step of analyzing and extracting the parallel structure of the entire character string, and the analysis method specifying step specifies an analysis method from among these character string analysis steps.

12. The character string analysis method according to claim 11, wherein the analysis method specifying step always specifies the dependency analysis step.

13. The analysis method specifying step, when the number of phrases in the character string is equal to or greater than a first predetermined number and the number of consecutive modified type phrases is equal to or greater than a second predetermined number, the case relationship analysis is performed. 12. A step is specified, and the step is specified.
Alternatively, the character string analysis method according to claim 12.

14. The analysis method specifying step specifies the parallel structure analysis step when the number of clauses in the character string is equal to or larger than a third predetermined number.
A method for analyzing a character string according to claim 13.

15. The character string analysis method according to claim 11, wherein the case relation analysis step executes case relation analysis based on a command from the dependency analysis step.

16. The parallel structure analysis step executes the parallel structure analysis based on a command from the dependency analysis step.
The method of parsing the character string described in any of.